ABSTRACT

We forecast the constraints on the values of $\sigma_8$, $\Omega_{\rm m}$, and cluster scaling relation
parameters which we expect to obtain from the XMM Cluster Survey (XCS). We assume a flat
$\Lambda$CDM Universe and perform a Monte Carlo Markov Chain analysis of the evolution of the
number density of galaxy clusters that takes into account a detailed simulated selection function.
A comparison with our current observed number of clusters shows good agreement with predictions.
We determine the expected degradation of the constraints as a result of self-calibrating the
luminosity-temperature relation (with scatter), including temperature measurement errors,
and relying on photometric methods for the estimation of galaxy cluster redshifts. We examine
the effects of systematic errors in scaling relation and measurement error assumptions. Using
only (T, z) self-calibration, we expect to measure $\Omega_{\rm m}$ to ±0.03 (and $\Omega_\Lambda$ to the same
accuracy assuming flatness), and $\sigma_8$ to ±0.05, also constraining the normalization and slope of the
luminosity-temperature relation to ±6 and ±13 per cent (at 1$\sigma$) respectively in the process.
Self-calibration fails to jointly constrain the scatter and redshift evolution of the
luminosity-temperature relation significantly. Additional archival and/or follow-up data will improve on
this. We do not expect measurement errors or imperfect knowledge of their distribution to
degrade constraints significantly. Scaling-relation systematics can easily lead to cosmological
constraints 2$\sigma$ or more away from the fiducial model. Our treatment is the first exact treatment
to this level of detail, and introduces a new 'smoothed ML' estimate of expected constraints.

Key words: cosmological parameters - cosmology: observations - cosmology: theory - 
galaxies: clusters: general - methods: statistical - X-rays: galaxies: clusters 
1 INTRODUCTION

The abundance of galaxy clusters as a function of mass and redshift
can give a powerful constraint on cosmological models. Specifically,
data on the evolution of the number density of galaxy clusters
with redshift has been used to obtain direct estimates for both $\sigma_8$,
the dispersion of the mass field smoothed on a scale of $8\,h^{-1}$ Mpc,
and $\Omega_{\rm m}$, the present mean mass density of the Universe
(Frenk et al. 1990; Oukbir & Blanchard 1992; Viana & Liddle 1996;
Oukbir & Blanchard 1997; Henry 1997; Bahcall, Fan & Cen 1997;
Eke et al. 1998; Reichart et al. 1999; Donahue & Voit 1999;
Viana & Liddle 1999; Blanchard et al. 2000; Henry 2000;
Borgani et al. 2001; Refregier, Valtchanov & Pierre 2002; Henry 2004;
Gladders et al. 2007; Rozo et al. 2007a). Furthermore,
such data could be used to constrain the present energy
density of a dark energy component and its equation
of state (Wang & Steinhardt 1998; Haiman, Mohr & Holder 2001;
Huterer & Turner 2001; Levine, Schulz & White 2002;
Weller, Battye & Kneissl 2002; Battye & Weller 2003; Hu 2003;
Majumdar & Mohr 2003, 2004; Wang et al. 2004; Lima & Hu 2005;
Mantz et al. 2008), or more simply the present vacuum
energy density associated with a cosmological constant,
$\Omega_\Lambda = \Lambda/3H_0^2$ (Holder, Haiman & Mohr 2001). Others have suggested
using galaxy clusters to constrain particle physics beyond the Standard
Model (e.g. Wang et al. 2005; Erlich, Glover & Weiner 2008),
or modified-gravity models, where it has been shown that e.g. the
Dvali-Gabadadze-Porrati (DGP) modified-gravity model should
be testable in coming surveys (Tang et al. 2006; Schäfer & Koyama
2008). An alternative to abundance evolution for constraining
cosmology with X-ray galaxy clusters is based on the gas
mass fraction (e.g. Allen, Schmidt & Fabian 2002; Vikhlinin et al.
2003; Ettori, Tozzi & Rosati 2003; Rapetti, Allen & Weller 2005;
Vikhlinin et al. 2006; Allen et al. 2008; Rapetti et al. 2008).



Galaxy cluster measurements are complementary to other
cosmological constraints derived from the Cosmic Microwave
Background (CMB) and distant Type Ia Supernovae observations,
and thus help break degeneracies amongst the various
cosmological parameters (Bahcall et al. 1999; Haiman et al. 2001;
Huterer & Turner 2001; Levine et al. 2002; Battye & Weller 2003;
Melchiorri et al. 2003; Wang et al. 2004).



Several surveys have been proposed with the explicit aim
of significantly increasing the number of known distant clusters
of galaxies. These proposals rely on a variety of detection methods
across a wide range of wavelengths: the Sunyaev-Zel'dovich
(SZ) effect in the millimeter (see Carlstrom, Holder & Reese
2002 for a review, and Juin et al. 2005 for a list of proposed
surveys); galaxy overdensities in the visible/infrared
(e.g. Gladders & Yee 2005; Hsieh et al. 2005; Rozo et al. 2007b);
bremsstrahlung emission by the intracluster medium (ICM) in the
X-rays (e.g. Jahoda & the DUET collaboration 2003; Haiman et al.
2005; Pierre et al. 2008). Galaxy cluster identification using weak
lensing techniques is another possibility (e.g. Wittman et al. 2006),
but is still in its infancy. Many of these proposals, in particular
those regarding the detection of distant clusters through their
X-ray emission, imply the building of new observing facilities such
as eROSITA (Predehl et al. 2006), that will likely take many years
to yield results. The cluster X-ray temperature is one of the best
proxy observables in lieu of mass; it is a better estimator of the
cluster mass than the cluster X-ray luminosity (e.g. Balogh et al.
2006; Zhang et al. 2006), but more difficult to determine, and
galaxy clusters are also most unambiguously identified in X-ray
images. This makes X-ray-based galaxy cluster surveys those with
the most accurately determined selection function. For all these
reasons, we have undertaken to construct a galaxy cluster catalogue,
called XCS: XMM Cluster Survey, based on the serendipitous
identification of galaxy clusters in public XMM-Newton (XMM) data
(Romer et al. 2001).

The aim of this paper is to forecast the expected galaxy cluster 
samples from the XCS and, based on those, its ability to constrain 
cosmology and cluster scaling relations using only self-calibration. 
Specifically, we consider the expected constraints on $\Omega_{\rm m}$, $\sigma_8$ and
the luminosity-temperature relation for a flat Universe. Our results 
represent the statistical power expected to be present in the full 
XMM archive. This work builds upon previous efforts in several 
ways, and to a large extent constitutes the first coherent treatment of 
effects and methods previously only considered separately. Specif- 
ically, we combine all the following characteristics: 

(i) we use a Monte Carlo Markov Chain (MCMC) approach and 
can thus characterize all degeneracies exactly (in contrast to Fisher 
matrix analyses), 

(ii) we include scatter in scaling relations in the parameter esti- 
mation (enabled by MCMC), 

(iii) we include a detailed, simulated selection function (essen- 
tially that of the XMM archive), not a simple hard flux/photon- 
count/mass limit, 

(iv) we include realistic photometric redshift errors, including 
degradation and catastrophic errors, 

(v) we include temperature measurement errors, partly based on 
detailed simulations of XMM observations, and propagate the red- 
shift errors to the temperature, and, 

(vi) we investigate quantitatively the effect on cosmological 
constraints from systematic errors in cluster scaling relation and 
measurement error characterization. 

Our work builds on the galaxy cluster survey exploitation
methods developed and studied primarily in Haiman et al.
(2001); Holder et al. (2001); Levine et al. (2002); Hu & Kravtsov
(2003); Hu (2003); Battye & Weller (2003); Majumdar & Mohr
(2003, 2004); Lima & Hu (2004); Wang et al. (2004); Lima & Hu
(2005). Forecasted cosmological constraints from XMM data have
also been considered for the XMM-LSS survey in Refregier et al.
(2002), but they did not take into account scaling-relation scatter
or measurement errors, and used the Press-Schechter mass function.
The most relevant precursors to this paper are Haiman et al.
(2001) and Majumdar & Mohr (2004), who consider cosmological
constraints expected from the Dark Universe Exploration Telescope
(DUET) (Jahoda & the DUET collaboration 2003) - a 10000
deg$^2$ X-ray survey with flux limit ~ $5 \times 10^{-14}$ erg s$^{-1}$ cm$^{-2}$
in the 0.5-2 keV band. We extend the methodology of both papers
through each of the six points above, either by more detailed
modeling or by obtaining more robust results through the use of
MCMC. Other relevant works are Huterer et al. (2004, 2006) and
Lima & Hu (2007), who discuss photometric redshifts. We particularly
complement these analyses through our detailed treatment/inclusion
of measurement errors and selection effects. The
recent work by Rapetti et al. (2008) takes an approach similar to
ours in that they employ MCMC, include scaling-relation scatter
and consider measurement errors, but focuses on combining future
X-ray gas mass fraction measurements with SZ cluster and CMB
power spectrum data.

The structure of this paper is as follows. We begin by reviewing
the progress to date of the XCS and present the survey selection
function (Sect. 2). Next, we present the models and methodology
we use to derive constraints on cosmological parameters from the
simulated XCS sample (Sects. 3 & 4). We then go on to the expected
cluster distributions and our estimates for the constraints on
$\sigma_8$, $\Omega_{\rm m}$, and cluster scaling relation parameters that we expect to
obtain from the XCS using self-calibration, including the effect of
temperature measurement errors and relying on photometric methods
to obtain XCS galaxy cluster redshifts (Sect. 5). We discuss and
summarize our findings in Sect. 6. Additional material setting out
modeling details is provided in the Appendix.



2 THE XMM CLUSTER SURVEY 
2.1 Background and current status 

XMM-Newton is the most sensitive X-ray spectral imaging tele- 
scope deployed to date. It is typically used in pointing mode, 
whereby it observes a single central target for a long period of 
time (the typical exposure time being ~ 20 kilo-seconds). The 
field of view of the XMM cameras is roughly half a degree across, 
so that a considerable area around the central target is observed 
'for free' during these long pointings. Already many thousands of 
these pointings are available in the public XMM archive. The XCS 
is exploiting this archive by carrying out a systematic search for 
serendipitous detections of clusters of galaxies in the outskirts of
XMM pointings (Romer et al. 2001). Once a cluster candidate has
been selected from the archival imaging data, it is then followed up
using optical imaging and/or optical spectroscopy, to confirm the
identification of the X-ray source and to measure redshifts (see
Sect. 3.4). For those XCS clusters that were detected with sufficient
counts, an X-ray spectroscopy analysis is carried out, again using
the archival data, in order to measure the temperature of the hot
intracluster medium (ICM). These temperatures can then be used
to study cluster scaling relations and/or to estimate the mass of the
cluster (see Sects. 3.2 & 3.3).

The XCS project is ongoing, but already more than 2000
XMM pointings have been analysed, yielding a cluster candidate
catalogue numbering almost 2000 entries. So far, the XCS covers
a combined area of 132 deg$^2$ suitable for cluster searching and
for which optical follow-up has been completed; i.e. this area
excludes overlapping and repeat exposures, regions of low Galactic
latitude, the Magellanic clouds, and pointings with very extended
central targets. Around 75-100 clusters with > 500 photons
and T > 2 keV are present in this initial area. With many
thousands more XMM pointings waiting to be analysed by the XCS,
and a mission lifetime extending to 2013, a conservative estimate
for the final XCS area for cluster searching is 500 deg$^2$. We use
500 deg$^2$ herein for XCS cosmology forecasting (see Table 1),
assume a redshift range of 0.1 < z < 1, and temperatures of
2 keV < T < 8 keV. We further limit our representative survey to
clusters with photon counts > 500 ($^{500}$XCS hereafter), so that we
can be sure to estimate X-ray temperatures with reasonable accuracy
(see Sect. 3.5). The lower redshift limit is associated with cluster
extents becoming too large, and the cosmic volume also becoming
small. The maximum redshift is chosen so that the luminosity-temperature
relation can still be reliably modelled/estimated (see
Sect. 3.3.3). The temperature range is chosen such that we can
expect i) a small contamination from galaxy groups (which typically
have temperatures T < 2 keV), yet include as many of the
numerous low-temperature clusters as possible, and ii) that clusters
above the high-temperature limit are sufficiently rare that none
can be expected. The final cluster catalogue (without the cut-offs
defined above for $^{500}$XCS) will contain several thousand clusters



Survey                        $^{500}$XCS
Sky coverage                  500 deg$^2$ (serendipitous)
Redshift coverage             0.1-1.0
X-ray temperature coverage    2-8 keV
Min. photon count             500
X-ray flux limit              By selection function$^a$

$^a$ The flux limit is ~ $3.5 \times 10^{-13}$ erg s$^{-1}$ cm$^{-2}$ in the
[0.1, 2.4] keV band, if defined as a probability of detection
greater than or equal to 50 per cent. See also Sect. 15. 2.7] and 
Fig.E 

Table 1. Survey specifications. 

out to a redshift of z ~ 2. The highest-redshift cluster discovered
by the XCS so far is XMMXCS J2215.9-1738 at z = 1.457
(Stanford et al. 2006; Hilton et al. 2007).

In addition to producing one of the largest samples of X-ray 
clusters ever compiled, the XCS will also be a valuable resource for 
cosmology studies (see Sect.Q. This is because the catalogue will 
be accompanied by a complete description of the selection func- 
tion. In this work we make use of an initial XCS selection function 
that assumes simple models for the distribution of the ICM, and 
flat cosmologies (see below). Future cosmology analyses will take 
advantage of more sophisticated selection functions that are based 
on hydrodynamical simulations of clusters (Kay et al. 2007).

2.2 The XCS selection function 

2.2.1 Model 

In order to properly model the selection function of a survey like 
the XCS, it is important to account for all of the observational 
variations present in real data. We can achieve this by placing a 
sample of fake surface-brightness profiles into real XMM Observa- 
tion Data Files (ODFs). This ensures that our simulated images re- 
create real-life issues such as clusters lying on chip gaps and point- 
source contamination. The fake surface-brightness profiles are created
as follows. We use an isothermal $\beta$ model with $\beta = 2/3$,
core radius $r_c = 160$ kpc (close to the mean values of $\beta = 0.64$,
$r_c = 163$ kpc obtained from a uniform ROSAT analysis of clusters
from 0.1 < z < 1.0; Ota & Mitsuda 2004), and plasma metallicity
$Z = 0.3\,Z_\odot$. For a given cosmology we simulate 700 sets of
cluster parameters:

• 10 redshifts (linearly spaced 0.1-1.0) 

• 10 luminosities (log. spaced 0.178-31.623 $\times 10^{44}$ erg s$^{-1}$)

• 7 temperatures (linearly spaced 2-8 keV) 

For selection function determination, we drew on a list of 1764 
ODFs that have already been processed by the XCS and have been 
deemed to be suitable for cluster searching (see above). Before each 
selection function run, a smaller list of 100 ODFs is selected at 
random from the full set of 1764. These 100 ODFs are then copied 
from the main XCS archive to local processing nodes for temporary 
storage, to speed up the analysis. Tests have shown that with 100 
ODFs it is still possible to reproduce the variance in exposure time, 
target type, point source density, etc., inherent to the XCS. In the 
following we define a 'selection function run' as the analysis over 
the 700 sets of cluster parameters and 100 ODFs - a total of 70000 
combinations. 
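
As an illustration only, the parameter grid described above could be enumerated as in the following Python sketch. The variable names are ours, and the exact logarithmic endpoints ($10^{-0.75}$ and $10^{1.5}$, in units of $10^{44}$ erg s$^{-1}$) are an assumption chosen to reproduce the quoted range 0.178-31.623.

```python
import itertools
import numpy as np

# Mock-cluster parameter grid as quoted in the text (names are illustrative).
redshifts = np.linspace(0.1, 1.0, 10)        # 10 values, linearly spaced
# Assumption: endpoints 10**-0.75 and 10**1.5 reproduce 0.178 and 31.623.
luminosities = np.logspace(-0.75, 1.5, 10)   # units of 1e44 erg/s
temperatures = np.linspace(2.0, 8.0, 7)      # keV, linearly spaced

# Every (z, L, T) combination: 10 x 10 x 7 = 700 parameter sets.
parameter_sets = list(itertools.product(redshifts, luminosities, temperatures))

# One selection function run pairs the 700 sets with 100 ODFs.
n_odfs = 100
n_combinations = len(parameter_sets) * n_odfs
```

The product of the three grids gives the 700 parameter sets, and multiplying by the 100 locally stored ODFs gives the 70000 combinations of one selection function run.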

For each of the 700 different combinations of cluster parame- 
ters, the process proceeds as follows. First, to account for the fact 

Figure 1. Selection function for our fiducial cosmology and different L-T evolution: (a) constant L-T relation; (b) self-similar L-T relation. Values in the shaded region are extrapolated from those in the coloured
region (0.1 < z < 1.0, 2 keV < T < 8 keV), for which the selection function has been calculated explicitly.



that the XCS searches the entire field of view for serendipitous clus- 
ter detections, the centre of the fake surface-brightness profile is 
randomly positioned into a blank XMM-style ODF, with a uniform 
probability across the field of view. The profile is then convolved 
with the appropriate PSF model. For this purpose we use the
two-dimensional medium-accuracy model$^1$. At this stage, an ODF is
chosen at random from the list of 100 stored locally, into which
the fake source will later be added. The profile is then assigned
an absorbed count rate using a series of arrays calculated using
XSPEC (Arnaud 1996). The arrays tabulate conversions from unabsorbed
bolometric luminosity to absorbed count-rate as a function
of temperature, redshift, hydrogen column density, and XMM cam- 
era/filter combination. The fake count-rate image is then multiplied 
by the exposure map of the chosen ODF to account for vignetting, 
masking and chip gaps. Finally, the fake cluster image is added to 
the original ODF at the chosen position, and the ODF is run through 
our source detection/classification pipeline to determine if the fake 
cluster passes our automated cluster-candidate selection process. 
For more details on the detection/classification pipeline, refer to 
Davidson et al. (in preparation). The process is repeated a total of 
one hundred times, so that we can build up an average XCS de- 
tectability for that parameter combination. Once the full set of 700 
combinations has been tested 100 times each, the run is complete. 
We then change the cosmology inputs and start the entire sequence 
again. The process is very CPU intensive; each selection function 
run (of 700 x 100 combinations) takes several weeks to run on a 
single node. For the forecasting work presented herein, we carried 
out seven selection function runs over the flat $\Lambda$CDM cosmologies
with $\Omega_{\rm m}$ = 0.22, 0.26, 0.28, 0.30, 0.32, 0.34 and 0.38. We limit
ourselves to flat cosmologies as we use a flatness prior in the fore- 
casting of cosmological constraints. 
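
The repeated-trial procedure described above amounts to a Monte Carlo estimate of a detection probability. The following Python sketch illustrates the idea only; it is not the XCS pipeline. The callable `detect` is a hypothetical stand-in for the image simulation plus source detection/classification step, and the circular field of view of 15 arcmin radius is our assumption based on the "roughly half a degree across" quoted in Sect. 2.1.

```python
import numpy as np

def random_position_in_fov(rng, radius_arcmin=15.0):
    """Draw a position uniformly over a circular field of view (assumed shape)."""
    r = radius_arcmin * np.sqrt(rng.uniform())  # sqrt gives uniform density per unit area
    phi = rng.uniform(0.0, 2.0 * np.pi)
    return r * np.cos(phi), r * np.sin(phi)

def estimate_detectability(detect, params, n_odfs=100, n_trials=100, seed=0):
    """Fraction of trials in which a mock cluster is recovered.

    `detect(params, odf_index, x, y)` is a hypothetical callable returning
    True if the mock cluster passes the candidate selection.
    """
    rng = np.random.default_rng(seed)
    n_detected = 0
    for _ in range(n_trials):
        x, y = random_position_in_fov(rng)      # random placement in the field
        odf_index = int(rng.integers(n_odfs))   # random ODF from the local list
        if detect(params, odf_index, x, y):
            n_detected += 1
    p = n_detected / n_trials
    # Binomial standard error on the estimated detection probability.
    err = np.sqrt(p * (1.0 - p) / n_trials)
    return p, err
```

With 100 trials per parameter combination, the binomial error on a detectability near 0.5 is about 0.05, which gives a rough sense of the statistical noise in each grid point under these assumptions.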

The resulting selection function is shown in Fig. 1 for the
two luminosity-temperature relations (see Sect. 3.3.1) we consider.
Note that the selection function in regions where we have not cal- 
culated it explicitly is extrapolated from the region where we have 
done so. Hence, its features in those extrapolated regions should 
only be considered a rough indication of its behaviour, particularly 



in the high-redshift, high-temperature region. This region is only 
relevant for including measurement errors, and since such high- 
temperature clusters are exceedingly rare, the uncertainty in this 
part of the selection function has no significant impact on our results.$^2$



2.2.2 Uncertainty 

The shape of the selection function is dependent on the cluster 
model employed, as described above. It is well known that clusters 
of galaxies have a range of morphologies, with core radii varying 
from many tens of kpc to a few hundred kpc and $\beta$ values varying
generally between 0.45 and 0.85 (e.g. Reiprich & Böhringer 2002;



i -.KeiPri 
20081) . 

To include the variation of cluster-model parameters in our 
analysis in a realistic manner, one would require i) a model for the 
distribution of such parameters among the cluster population (in- 
cluding correlations among parameters), and ii) a characterization 
of the selection function dependence on such parameters. Lacking 
either or both of these will produce some level of uncertainty in 
cluster number predictions and cosmological parameter constraints. 
However, assessing the level of such uncertainty of course requires 
a fiducial model (realizing i) and ii)) to compare with. As we do 
not currently have a realistic model for the model-parameter dis- 
tribution among the cluster population, it is somewhat premature 
to carry out such an analysis. In actual data analysis, we intend to 
model this in detail. What we can currently do is to compare our 
standard selection function to one assuming that all clusters have 
the most extreme values of cluster-model parameters, leading to 
a gross overestimation of the overall uncertainty in cluster num- 
ber predictions. We have carried this calculation out for clusters 
with temperatures typical for the underlying distribution at differ- 
ent redshifts, and describe it below. We again stress that its useful- 
ness for estimating the actual uncertainty in the selection function 
is limited, as it does not take into account the actual distribution 
of clusters and their model parameters. Ultimately, we expect that 



$^1$ http://xmm.vilspa.esa.es/external/xmm_sw_cal/calib/



$^2$ We have subsequently verified the validity of the extrapolation to this
level of accuracy with new calculations.
the cosmological constraints we obtain would change little, even
if a more realistic model/selection function was used. This is be- 
cause the changes in the selection function would have a similar 
impact for the fiducial cosmological model, and for models in its 
neighbourhood. 

We have tested the sensitivity of the selection function for
clusters with > 500 photons to variations in the cluster core radius
$r_c$, between the values of 60 kpc and 260 kpc (recall the fiducial
value used in this work is 160 kpc, see Sect. 2.2.1). For this we use
mock clusters with typical temperatures of T = 3 keV (and hence
luminosities) for a given redshift (as predicted by our models, see
Sects. 3 & 4). In the following, we refer to the relative difference
in the selection function detectability, as this is most relevant to the 
relative difference in numbers of clusters. Our results show that, 
for most of the redshift range tested (0.1 < z < 1), clusters with 
a core radius of ~ 140 kpc are easier to detect (as extended XMM 
sources), than those with smaller or larger core radii. However, the 
dependence is shallow; the relative uncertainty in the detectability 
is less than 10 per cent up to a redshift of z ~ 0.4, across the entire 
$r_c$ range. At higher redshifts, the relative uncertainty approaches
30-40 per cent. However, this subset of clusters constitutes only 
~ 30 per cent of the total population. For higher-temperature clus- 
ters, the relative uncertainty drops back to around 10 per cent at 
0.1 < z < 1 for 4keV < T < 5 keV. 

In summary, our model for the cluster population is a simplifi- 
cation based on mean observational values of cluster-model param- 
eters, and as a result will have somewhat differing detection prop- 
erties compared to a real sample. To characterize such uncertainty 
requires modelling of the cluster-model parameter distribution, and 
the selection function dependence on those parameters. However, 
once such information becomes available, it will be included in the 
analysis and hence remove/reduce such uncertainty. As we do not 
currently have a realistic model of the cluster parameter distribution,
we have determined the impact of assuming extreme structural
cluster parameters, and found that at most the typical selection
function uncertainty is of the order of 1 per cent. These results
agree with those of Burenin et al. (2007) for the 400d survey,
which show that reasonable variations in cluster size, morphology 
and scaling relations induce an uncertainty in the detectability for 
a given flux of typically less than 5 per cent. 



3 FROM X-RAY OBSERVABLES TO MASS 
3.1 Modeling summary 

Making predictions for X-ray cluster observations requires the 
modeling of scaling relations to relate temperature to mass, and 
temperature to luminosity. In addition, the observables will have 
uncertainties associated with them, which need to be taken into ac- 
count. The following subsections detail our modeling assumptions, 
but we summarize them here for reference and orientation. 

We first assume that we know a priori exactly how the clus- 
ter X-ray temperature relates to luminosity at the present time, and 
how this relation evolves with redshift. We then study how the con- 
straints on cosmological parameters degrade if such an assumption 
is dropped. We consider four extra free parameters: two parameters 
to characterize the present-day, power-law, relation between cluster 
X-ray temperature and luminosity, another to describe its redshift 
evolution as a power of $(1 + z)$, and lastly one for the logarithmic
dispersion in the (assumed) Gaussian distribution of the intrinsic 
(redshift-independent) scatter in the relation between cluster X-ray 
temperature and luminosity. 



In addition, we evaluate the full impact on the XCS's ability 
to impose constraints on cosmological parameters that arises from 
assuming a dispersion in the Gaussian photometric redshift distri- 
bution of either 5 or 10 per cent about the true redshift, both with 
and without the presence of unaccounted-for catastrophic errors in 
the photometric redshift estimation procedure. Further, we will also 
determine the impact of a systematic mis-estimation of the assumed 
true dispersion in the photometric redshifts about the true redshift. 
Our aim is to test the impact of realistic assumptions regarding the 
distribution of photometric redshifts around the true redshift, and 
then determine by how much such impact increases by considering 
a worst-case scenario. 
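
A minimal Python sketch of such a photometric-redshift error model is given below, for orientation only. It assumes that "5 or 10 per cent about the true redshift" means a Gaussian with standard deviation equal to that fraction of the true redshift. The treatment of catastrophic errors (a fraction `f_cat` of clusters reassigned a redshift drawn uniformly over the survey range) is a hypothetical placeholder, not the model used in this paper.

```python
import numpy as np

def draw_photometric_redshifts(z_true, frac_sigma=0.05, f_cat=0.0,
                               z_range=(0.1, 1.0), seed=0):
    """Scatter true redshifts into mock photometric redshifts.

    Assumptions (ours): sigma_z = frac_sigma * z_true, and catastrophic
    failures replace the estimate by a uniform draw over z_range.
    """
    rng = np.random.default_rng(seed)
    z_true = np.atleast_1d(np.asarray(z_true, dtype=float))
    # Gaussian scatter about the true redshift.
    z_phot = rng.normal(loc=z_true, scale=frac_sigma * z_true)
    # Flag a random fraction f_cat as catastrophic failures.
    is_cat = rng.uniform(size=z_true.shape) < f_cat
    z_phot[is_cat] = rng.uniform(z_range[0], z_range[1], size=int(is_cat.sum()))
    return z_phot, is_cat
```

Calling the function with `frac_sigma=0.05` and `frac_sigma=0.10` gives the two dispersion cases considered, while a non-zero `f_cat` mimics unaccounted-for catastrophic errors under the stated placeholder model.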

Similarly, we consider the impact of realistic X-ray tempera- 
ture errors obtained from simulations based on the relevant XMM 
fields, as well as significantly larger errors corresponding to a 
worst-case scenario. Lastly, we consider the impact of incorrect as- 
sumptions about the cluster scaling relations on cosmological con- 
straints. 

Summary tables with our main cluster scaling relation and 
measurement error assumptions are given in Sect. 5. Detailed
information on the mathematical treatment is given in the Appendix.



3.2 The X-ray temperature to mass relation 

We need to assume a relation between cluster X-ray temperature 
and mass to be able to predict cluster distributions. The reason is 
that presently the effect of cosmological parameters on the galaxy
cluster population can only be accurately predicted as a function
of cluster mass (e.g. Reiprich & Böhringer 2002). The X-ray
temperature is one of the best proxy observables, as explained in the
Introduction.



3.2.1 Evolution 



We assume the self-similar prediction (e.g. Kaiser 1986;
Bryan & Norman 1998; Voit 2005a),

$$ T \propto M_{\rm v}^{2/3} \left[ \Delta_{\rm v}(z)\, E^2(z) \right]^{1/3} $$    (1)

for the redshift dependence of the relation between cluster
X-ray temperature and virial mass to hold for any combination
of cosmological parameters, given that it is consistent
with the most recent analyses of observational data (Ettori et al.
2004a,b; Arnaud, Pointecouteau & Pratt 2005; Kotov & Vikhlinin
2005, 2006; Vikhlinin et al. 2006; Zhang et al. 2006). Here $M_{\rm v}$ is
the cluster virial mass, while $\Delta_{\rm v}(z)$ is the mean overdensity within
the cluster virial radius with respect to the critical density. If the
only relevant energy densities in the Universe are those associated
with non-relativistic matter and a cosmological constant, then

$$ E^2(z) = \Omega_{\rm m}(1+z)^3 + \Omega_k(1+z)^2 + \Omega_\Lambda , $$    (2)

with $\Omega_k = 1 - \Omega_{\rm m} - \Omega_\Lambda$. (Note that we will restrict ourselves to a
flat Universe, $\Omega_k = 0$, in our analysis - see Sect. 4.2.) Deviations
from a self-similar mass-temperature relation will be considered in
Sect. 5.5, as explained in the following Section.
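
For concreteness, the redshift factors entering the self-similar scaling can be evaluated numerically as in the Python sketch below. It assumes the standard self-similar form, in which the temperature at fixed virial mass scales as $[\Delta_{\rm v}(z) E^2(z)]^{1/3}$. The fitting formula used for $\Delta_{\rm v}(z)$ (the flat-universe fit of Bryan & Norman 1998, defined with respect to the critical density) is our assumption for illustration; the text does not state which prescription is used. Function names are ours.

```python
import numpy as np

def e_squared(z, omega_m, omega_lambda):
    """E^2(z) for matter plus a cosmological constant, with curvature term."""
    omega_k = 1.0 - omega_m - omega_lambda
    return omega_m * (1.0 + z) ** 3 + omega_k * (1.0 + z) ** 2 + omega_lambda

def delta_v_flat(z, omega_m):
    """Virial overdensity w.r.t. critical density in a flat universe.

    Assumption: Bryan & Norman (1998) fit, 18 pi^2 + 82 x - 39 x^2,
    with x = Omega_m(z) - 1.
    """
    e2 = e_squared(z, omega_m, 1.0 - omega_m)
    x = omega_m * (1.0 + z) ** 3 / e2 - 1.0
    return 18.0 * np.pi ** 2 + 82.0 * x - 39.0 * x ** 2

def temperature_evolution_factor(z, omega_m, z_ref=0.05):
    """T(z) / T(z_ref) at fixed virial mass for self-similar evolution (flat)."""
    def scale(zz):
        return (delta_v_flat(zz, omega_m) * e_squared(zz, omega_m, 1.0 - omega_m)) ** (1.0 / 3.0)
    return scale(z) / scale(z_ref)
```

For example, `temperature_evolution_factor(1.0, 0.3)` gives the factor by which a cluster of fixed virial mass is hotter at z = 1 than at the reference redshift z = 0.05 in a flat model with $\Omega_{\rm m} = 0.3$, under the assumptions stated above.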



3.2.2 Normalization 

The constant of proportionality is set by demanding that for our
fiducial cosmological model (with $\sigma_8 = 0.8$, see Sect. 4.2)

$$ M_{500} = 3 \times 10^{14}\, h^{-1}\, M_\odot $$    (3)

at z = 0.05 for an X-ray temperature of 5 keV, where $M_{500}$ is the
mass within a sphere centered on the cluster within which its mean
density falls to 500 times the critical density at the cluster redshift.
In this way, our fiducial cosmological model reproduces the local
abundance of galaxy clusters as given by the HIFLUGCS catalogue
(Reiprich & Böhringer 2002; Pierpaoli, Scott & White 2001;
Viana et al. 2003). Note that such a normalization of the cluster
X-ray temperature to mass relation happens to be very close to
that directly derived from X-ray data by Arnaud et al. (2005) and
Vikhlinin et al. (2006).

The conversion between $M_{500}$ and the halo mass, $M_{180\Omega_{\rm m}(z)}$,
will be carried out by using the formulae derived by Hu & Kravtsov
(2003) under the assumption that the halo density profile is of the
NFW type (Navarro, Frenk & White 1995, 1996, 1997), and we
will take the concentration parameter to be 5. This has been shown
to provide a good description of the typical density profile in galaxy
clusters (see Arnaud 2005 or Voit 2005b and references therein;
Vikhlinin et al. 2006).

The normalization of the M-T relation is subject to a number
of uncertainties, the most important of which are the possible
violation of hydrostatic equilibrium (Rasia, Tormen & Moscardini
2004; Nagai, Kravtsov & Vikhlinin 2007) and the possible difference
between the spectroscopic X-ray temperature and the temperature
of the electron gas (Mazzotta et al. 2004; Rasia et al. 2005;
Vikhlinin 2006). The precise level of these effects remains to be
firmly established, but could be of the order 50% in the normalization
mass (e.g. Vikhlinin 2006; Nagai et al. 2007). The scatter,
as well as slope, could also be under-estimated due to these effects
(Vikhlinin 2006; Nagai et al. 2007). We make some estimates of all
these systematic effects on cosmological constraints in Sect. 5.5.



3.2.3 Scatter 

We assume that the intrinsic scatter in the relation between cluster
X-ray temperature and mass has a Gaussian distribution (truncated
at 3$\sigma$ and re-normalized) with a redshift-independent dispersion
of 0.10 about the logarithm of the temperature. This is motivated
by both cluster X-ray data analysis (e.g. Arnaud et al. 2005;
Vikhlinin et al. 2006; Zhang et al. 2006) and results from N-body
hydrodynamic simulations (e.g. Viana et al. 2003; Borgani et al.
2004; Balogh et al. 2006; Kravtsov, Vikhlinin & Nagai 2006). As
explained in the preceding Section, we consider systematic deviations
in the scatter in Sect. 5.5.
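
A truncated and re-normalized Gaussian of this kind can be sampled by simple rejection, as in the following Python sketch. This is an illustration under our own naming; the choice of logarithm base for the temperature is left to the caller, since only the dispersion of 0.10 and the 3$\sigma$ truncation are specified above.

```python
import numpy as np

def draw_log_temperature_scatter(n, sigma=0.10, n_sigma_cut=3.0, seed=0):
    """Draw n offsets in log-temperature from a Gaussian truncated at n_sigma_cut.

    Rejection sampling discards draws beyond the cut, which is equivalent to
    sampling the re-normalized truncated distribution.
    """
    rng = np.random.default_rng(seed)
    kept = np.empty(0)
    while kept.size < n:
        draws = rng.normal(0.0, sigma, size=n)
        kept = np.concatenate([kept, draws[np.abs(draws) <= n_sigma_cut * sigma]])
    return kept[:n]
```

Because only about 0.3 per cent of Gaussian draws fall beyond 3$\sigma$, the truncation barely changes the effective dispersion (it is reduced by roughly 1 per cent), but it removes the far tails from the scatter model.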



3.3 The X-ray luminosity to temperature relation 

In order to understand how the XCS selection function depends on 
cluster mass, we need to know how cluster X-ray luminosity and 
temperature relate to cluster mass (see Sect. 2.2). In practice, we
will use the relation between luminosity and temperature instead 
of that between luminosity and mass, in effect relating these two 
quantities via the temperature. This makes sense because the es- 
timation of cluster mass from X-ray data is always based on the 
X-ray temperature, via the assumption of hydrostatic equilibrium, 
and not on the luminosity. Thus, while we always need, at least im- 
plicitly, to know how the cluster luminosity relates to temperature 
to derive the relation between the luminosity and mass from X-ray 
data, the reverse is not true. 

As for the mass-temperature relation, assuming self-similarity
leads to a specific prediction (Kaiser 1986),

$$ L(z, T) = L(0.05, T) \left[ \frac{\Delta_v(z)\, E^2(z)}{\Delta_v(0.05)\, E^2(0.05)} \right]^{1/2}, \qquad (4) $$



under which clusters with the same X-ray temperature are predicted 
to be more X-ray luminous if they have a higher redshift. We have 
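chosen here to normalize the relation with respect to the local ($z = 0.05$)
relation. Based on this expression, we write the L-T relation
in the general form

$$ \log_{10}\left( \frac{L_X}{10^{44}\, h^{-2}\, {\rm erg\, s^{-1}}} \right) = \alpha + \beta \log_{10}\left( \frac{kT}{1\,{\rm keV}} \right) + \gamma_s \log_{10}\left[ \Delta_v(z)\, E^2(z) \right] + \gamma_z \log_{10}(1 + z) + N(0, \sigma_{\log L_X}), \qquad (5) $$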

and discuss below the assumptions made for the different parame- 
ters. 
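As an illustration of how equation (5) is evaluated, the following minimal Python sketch computes the mean of $\log_{10} L_X$ (the scatter term is omitted). The function names are hypothetical, and the Bryan & Norman (1998) fitting formula for $\Delta_v$ is used purely as a stand-in assumption, since the definition of $\Delta_v$ adopted in this work is given elsewhere and may differ.

```python
import math


def e_z(z, omega_m=0.3):
    """Dimensionless Hubble rate E(z) = H(z)/H0 for a flat LCDM universe."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def delta_v(z, omega_m=0.3):
    """Virial overdensity relative to the critical density.

    ASSUMPTION: Bryan & Norman (1998) fit for flat LCDM, used here only
    as a placeholder for the paper's own Delta_v.
    """
    x = omega_m * (1.0 + z) ** 3 / e_z(z, omega_m) ** 2 - 1.0
    return 18.0 * math.pi ** 2 + 82.0 * x - 39.0 * x ** 2


def mean_log10_lx(kT_keV, z, alpha=-1.90, beta=2.5,
                  gamma_s=0.0, gamma_z=0.0, omega_m=0.3):
    """Mean log10(L_X / 1e44 h^-2 erg/s) from equation (5), without scatter."""
    evolution = gamma_s * math.log10(delta_v(z, omega_m) * e_z(z, omega_m) ** 2)
    return (alpha + beta * math.log10(kT_keV)
            + evolution + gamma_z * math.log10(1.0 + z))
```

Setting `gamma_s=0.5, gamma_z=0.0` gives the self-similar scenario, while the defaults give the no-evolution scenario.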



3.3.1 Evolution ($\gamma_s$, $\gamma_z$)

We consider two possible fiducial scenarios, which bracket most
observational results and theoretical expectations: either

• no evolution ($\gamma_s = \gamma_z = 0$) or

• self-similar evolution ($\gamma_s = 1/2$, $\gamma_z = 0$)

for the fiducial combination of cosmological parameters. The
parameters $\gamma_s$ and $\gamma_z$ are defined above in equation (5). Presently,
there is some uncertainty surrounding the redshift evolution of 
the relation between cluster X-ray luminosity and temperature. 
Essentially, what we know is how that relation behaves for redshifts
below 0.3 (e.g. Ikebe et al. 2003; Novicki, Sornig & Henry
2002; Ota et al. 2006; Zhang et al. 2006). For higher redshifts, the
data is still sparse, and the evidence contradictory, from claims
that the relation between cluster X-ray luminosity and temperature
barely evolves at all with redshift (Holden et al. 2002; Ettori et al.
2004a,b; Ota et al. 2006; Branchesi et al. 2007), to claims that its
evolution is close to the self-similar prediction (Novicki et al. 2002;
Vikhlinin et al. 2002; Lumb et al. 2004; Kotov & Vikhlinin 2005;
Maughan et al. 2006; Zhang et al. 2006; Hicks et al. 2008). Some
authors argue that self-similarity remains viable at all redshifts, and
that at least some of the observed discrepancies could be due to
selection effects, as the Malmquist bias from scaling-relation scatter
(also discussed below) could distort the deduced evolution if the
sample selection is not sufficiently understood (e.g. Branchesi et al.
2007; Maughan 2007; Pacaud et al. 2007; Nord et al. 2008). On the
other hand, Hilton et al. (2007) argue for deviation from the
self-similar prediction based on a set of high-redshift clusters combined
with the recently discovered XCS cluster XMMXCS J2215.9-1738
at $z = 1.457$.

When the XCS catalogue becomes available, the relation be- 
tween cluster X-ray luminosity and temperature, as a function of 
redshift, will be estimated jointly with the cosmological parame- 
ters, but for now we will have to rely on the limited information 
available. 



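3.3.2 Normalization & slope ($\alpha$, $\beta$)

We assume the local ($z = 0.05$) relation between the cluster X-ray
luminosity in the ROSAT [0.1, 2.4] keV band and temperature to be

$$ \log_{10}\left( \frac{L_X}{h^{-2}\, {\rm erg\, s^{-1}}} \right) = 42.1 + 2.5 \log_{10}\left( \frac{kT}{1\,{\rm keV}} \right), \qquad (6) $$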
as was derived in Viana et al. (2003) for a combination of
cosmological parameters similar to those assumed for our fiducial
cosmological model. The X-ray data used in Viana et al. (2003)
was that of galaxy clusters present in the HIFLUGCS catalogue
(Reiprich & Böhringer 2002), and therefore the conversion
between $L_X$ and X-ray bolometric luminosity is performed through a
fit (derived by us) based on the values both quantities take for the
galaxy clusters in HIFLUGCS,

$$ \frac{L_X}{L_{\rm bol}} = 0.25 + 0.7 \exp(-0.23\, kT / 1\,{\rm keV}). \qquad (7) $$

As in Ikebe et al. (2003), the relation between the cluster X-ray
luminosity and temperature derived in Viana et al. (2003) takes into
account the fact that any flux-limited sample of galaxy clusters will 
be composed of objects which are on average more X-ray luminous 
than the mean luminosity of all existing galaxy clusters with the 
same redshift and X-ray temperature. This Malmquist type of bias 
increases with decreasing temperature, and thus ignoring it leads 
not only to an overestimation of the normalization of the relation 
between luminosity and temperature, but also to an underestima- 
tion of its slope. 

3.3.3 Scatter ($\sigma_{\log L_X}$)

We assume that the intrinsic scatter in the relation between cluster
X-ray luminosity (in the 0.1 to 2.4 keV band) and temperature has a
redshift-independent Gaussian distribution (truncated at $3\sigma$ and
re-normalized) about the logarithm of the X-ray luminosity, with a
dispersion $\sigma_{\log L_X} = 0.30$ (Ikebe et al. 2002; Viana et al. 2003).
This is also close to what was found by Kay et al. (2007) in the
CLEF simulation. Although Kay et al. also observe an evolution of 
the scatter with redshift, there is no strong observational evidence 
for or against such an evolution at present, and therefore we do not 
include it in our analysis. 

The existence of intrinsic scatter in the relation between clus- 
ter luminosity and mass will effectively increase the observed num- 
ber of galaxy clusters above any X-ray luminosity (or flux) thresh- 
old, relative to the case without scatter. This results from the steep- 
ness of the cluster mass function, due to which significantly more 
clusters have their X-ray luminosity scattered up than down across 
any given luminosity threshold. Therefore, intrinsic scatter between 
X-ray luminosity and mass can have a considerable impact on the 
predicted number of XCS clusters and on the estimation of the 
constraints the XCS will impose on cosmological parameters. This 
scatter can be considered as the combination of the scatter in the 
luminosity to temperature and temperature to mass relations, with 
clear observational evidence that the former dominates over the latter
(Stanek et al. 2006; Zhang et al. 2006).
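The size of this effect can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that the cumulative counts follow a power law $N(>L) \propto L^{-a}$ (the real mass function is not a power law) and that the scatter in $\log_{10} L$ is Gaussian, truncated at $3\sigma$ and re-normalized as described above. Under these assumptions the counts above any threshold are multiplied by the mean of $10^{a\epsilon}$ over the scatter distribution. The function name and the grid-based integration are choices made for this example only.

```python
import math


def count_boost(slope, sigma_log10, n_sigma=3.0, n_grid=2001):
    """Boost in counts above a threshold for N(>L) ~ L**(-slope).

    The scatter in log10(L) is Gaussian with dispersion sigma_log10,
    truncated at +/- n_sigma and re-normalized (done by dividing by the
    summed weights on a uniform grid).
    """
    if sigma_log10 == 0.0:
        return 1.0
    total_weight = 0.0
    total = 0.0
    for i in range(n_grid):
        # eps runs uniformly from -n_sigma*sigma to +n_sigma*sigma
        eps = (-n_sigma + 2.0 * n_sigma * i / (n_grid - 1)) * sigma_log10
        weight = math.exp(-0.5 * (eps / sigma_log10) ** 2)
        total_weight += weight
        total += weight * 10.0 ** (slope * eps)
    return total / total_weight
```

For an untruncated Gaussian the same factor has the closed form $\exp[(a \ln 10\, \sigma)^2 / 2]$, which the truncated result approaches when $a\sigma$ is small. The boost is always greater than one for $a > 0$ and grows with the steepness of the distribution.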

As higher redshifts are considered, it is expected that an in- 
creasing number of galaxy clusters will have undergone recent 
major mergers, not only leading to increased scatter in the clus- 
ter scaling relations but also making its distribution highly non- 
Gaussian, with long tails developing towards both high X-ray 
luminosity and, to a lesser degree, temperature, at fixed mass
(Randall, Sarazin & Ricker 2002). This has the potential to substantially
affect the estimation of the constraints the XCS will be
able to impose on cosmological parameters. There is a lack of high- 
redshift observational data in this regard and we are also not con- 
fident that we will detect, for the purposes of understanding this 
behaviour, many useful clusters at $z > 1$. We therefore chose to
consider in the estimation procedure only those clusters in the mock 
XCS catalogues which have a redshift z < 1. 



3.4 Photometric redshifts 

3.4.1 The role of photometric redshifts 

Redshifts are required for XCS clusters, both to place them cor- 
rectly in the evolutionary sequence and to allow the measurement 
of X-ray temperatures from XMM spectra. With regard to the lat- 
ter point, pure thermal bremsstrahlung spectra are essentially fea- 
tureless (barring a high-energy cut-off), making them degenerate 
in temperature and redshift. Therefore, in the absence of indepen- 
dent redshift information, all one can measure from a typical XCS 
cluster spectrum would be a so-called apparent X-ray temperature,
i.e. one scaled by $(1 + z)$; see Appendix A1.2. As shown by
Liddle et al. (2001), these apparent temperatures are not sufficient
to allow one to measure cosmological parameters from cluster cat- 
alogues. As a result, optically-determined redshifts will be required 
for almost all clusters in the XCS catalogue (the exception being a 
tiny number that are detected with sufficient signal to noise to allow 
X-ray emission features, such as the Iron K complex at ~ 7 keV, to 

be resolved over the thermal continuum). 

As is now typical for cluster surveys (e.g. Gladders & Yee 2005),
the XCS is relying heavily on the photometric redshift
technique for its optical follow-up. This is because photometric 
redshifts are much more efficient, in terms of telescope time re- 
quirements, than spectroscopic redshifts. However, they have the 
disadvantage, over spectroscopic redshifts, that the redshift errors 
are larger and sometimes poorly understood. The XCS is using 
both public-domain photometry (e.g. from SDSS and 2MASS) and 
proprietary data from the NOAO-XCS Survey (NXS; Miller et al.
2006) to both optically confirm (as clusters) XCS candidates and to
measure photometric redshifts. To date, more than 400 XCS candi- 
dates have been optically confirmed in this way. 

Errors on photometric redshifts must be accounted for when 
determining cosmological parameters from cluster surveys, and so 
we have included prescriptions for such errors in the forecasting 
work presented herein. Our prescriptions include both purely statis- 
tical errors and so-called catastrophic systematic errors. As shown 
by previous work (Huterer et al. 2004, 2006; Lima & Hu 2007),
purely statistical errors have a negligible impact on cosmological 
parameter constraints. By contrast, if catastrophic errors are not ac- 
counted for properly in the fitting, they could have a significant im- 
pact on cosmological parameter constraints. We note that previous 
work has concentrated only on the impact of redshift errors on the 
evolutionary sequence, whereas we have also included the impact 
of photometric errors on X-ray temperature determinations. 



3.4.2 Distribution 



Following Huterer et al. (2004) we assume that the statistical error
in the photometric redshifts of individual galaxy clusters has
a Gaussian distribution about the true redshift, $z_t$. In an attempt
to reproduce the expected degradation with redshift of the absolute
accuracy of cluster photometric redshift estimation methods,
and in contrast to Huterer et al. (2004) (but in the same way as
Lima & Hu 2007), we will assume the dispersion to be proportional
to $(1 + z_t)$, normalized at $z_t = 0$ to either $\sigma_0 = 0.05$ or
$\sigma_0 = 0.10$ (Gladders & Yee 2000; Gladders 2004; Gladders
2005; Gladders et al. 2007). Unaccounted-for systematic errors in
the photometric redshift estimation procedure are much harder to 
model, because they can take a variety of guises. We will consider 
here one such type of error: catastrophic errors in the photomet- 
ric redshift estimation procedure. The existence of unaccounted-for 
Figure 2. Realistic redshift error distributions at various redshifts. The
upper right panel shows a magnification of the bottom-right distribution,
highlighting the catastrophic-error part of the distribution.

Figure 3. Mean fractional temperature errors from the simulations
performed, for 500 photons, and as marginalized over expected absorption
columns for the XCS.



catastrophic errors will be modelled by assigning a random photometric
redshift error to either a fraction $f_{\rm cat} = 0.05$ or $f_{\rm cat} = 0.10$
of the galaxy clusters, drawn from a Gaussian distribution that has
four times the dispersion of the standard distribution, with the
requirement that the photometric redshift has to be at least $1\sigma$
away from the true redshift. The functional form of the redshift
error distribution is given in Appendix A1.

We label the case $\{\sigma_0 = 0.05, f_{\rm cat} = 0.05\}$ 'realistic' and the
case $\{\sigma_0 = 0.10, f_{\rm cat} = 0.10\}$ 'worst-case' redshift errors.
Examples of realistic redshift error distributions are shown in Fig. 2.
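The prescription above can be sketched as a simple sampler. This is a minimal illustration rather than the implementation used for the forecasts: the function name is hypothetical, NumPy is assumed to be available, and the $1\sigma$ exclusion for catastrophic errors is taken to refer to the dispersion of the standard (non-catastrophic) distribution, which is an interpretation of the text. The exact functional form is the one given in Appendix A1.

```python
import numpy as np


def sample_photoz(z_true, sigma0=0.05, f_cat=0.05, rng=None):
    """Draw photometric redshifts for an array of true redshifts.

    Statistical errors: Gaussian with dispersion sigma0 * (1 + z_true).
    Catastrophic errors: a fraction f_cat of clusters, drawn from a
    Gaussian four times wider and at least 1 sigma (ASSUMED: of the
    standard distribution) away from the true redshift.
    Returns (z_phot, is_catastrophic).
    """
    rng = np.random.default_rng() if rng is None else rng
    z_true = np.atleast_1d(np.asarray(z_true, dtype=float))
    sigma = sigma0 * (1.0 + z_true)
    err = rng.normal(0.0, sigma)
    is_cat = rng.random(z_true.shape) < f_cat

    # Rejection sampling: redraw catastrophic errors until |err| >= sigma.
    todo = is_cat.copy()
    while todo.any():
        draw = rng.normal(0.0, 4.0 * sigma[todo])
        err[todo] = draw
        todo[todo] = np.abs(draw) < sigma[todo]
    return z_true + err, is_cat
```

Calling `sample_photoz(z, sigma0=0.10, f_cat=0.10)` corresponds to the 'worst-case' labelling, while the defaults correspond to the 'realistic' case.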

3.5 X-ray temperature 

3.5. 1 Estimating the measurement errors 

Initial estimates (Liddle et al. 2001) showed that X-ray temperatures
measured for XCS clusters are expected to have an associated
measurement uncertainty of less than 20 per cent at $1\sigma$. However,
these estimates were based on a photon count of 1000 and assume
a single hydrogen column density over the XMM fields, and are
therefore not directly applicable to our $^{500}$XCS sample. Hence,
in order to estimate the temperature errors that will be present in
the XCS statistical sample more accurately, we have conducted
Monte Carlo simulations using the XSPEC spectral fitting package
(Arnaud 1996). We created 1000 sets of fake spectra for the
XMM-Newton EPIC PN and MOS instruments, from a MEKAL
plasma model (Mewe, Lemen & van den Oord 1986) multiplied by
a WABS photo-electric absorption model (Morrison & McCammon
1983). Responses for a mean off-axis angle were used and a mean
background was added. The model was then fitted to each of the
spectra to derive a temperature. A plasma metallicity of $0.3\, Z_\odot$ was
used throughout, in accordance with the assumptions in our selection
function calculations (see Sect. 2.2), and we assume a photon
count of 500. This procedure was repeated for a range of input tem- 
peratures, redshifts and absorption columns. We then marginalize 
over the hydrogen absorption columns using the expected hydrogen 
column distribution for our XMM fields. 

The mean fractional temperature errors from our simulations 
are shown in Fig. 3. The largest influence on the temperature errors
comes from the input temperature itself. Since metal lines in the 
spectrum provide much better constraints on the temperature than 
the shape of the bremsstrahlung continuum, and the fraction of line 
Figure 4. Realistic temperature error distributions at various redshifts and
temperatures, based on our XMM-Newton simulations, for a photon count 
of 500. 



emission in the spectrum declines with increasing temperature, the 
errors are larger for hotter clusters. The effect of redshift on the 
errors is much smaller and itself temperature-dependent. For low- 
temperature systems at high redshifts, part of the X-ray spectrum 
is shifted out of the bottom of the XMM passband, increasing the 
errors. For high-temperature systems, the effect of increasing red- 
shift is to shift the source spectrum to lower energies for which the 
XMM effective area is larger, thus decreasing the errors. 



3.5.2 Distribution 

The distribution of temperatures obtained in our simulations was 
fitted by an asymmetric Gaussian function to parametrize the tem- 
perature error distribution, with the fractional error given by a two- 
dimensional quadratic expression in temperature and redshift. We 
marginalize over the distribution of absorption columns found in 
XCS fields to obtain mean parameters for our asymmetric Gaus- 
sian error distribution. The exact functional form of the fitted error 
distribution is given in equation (A8) in Appendix A1.

We will label the case with $\sigma_T$ according to our simulation
results as 'realistic' and the case with three times this dispersion
as 'worst-case' temperature errors. Examples of realistic temperature
error distributions are shown in Fig. 4. Note that, as we are
assuming that all detected clusters have a photon count of exactly
500, our error distributions represent a worst-case scenario in this
regard.



4 FROM MASS TO COSMOLOGY AND CONSTRAINTS 
4.1 The mass function 

Having connected our direct X-ray observables to cluster mass us- 
ing the methodology in the preceding Section, we can then combine 
these relations with the mass function (below) to find the cluster 
distribution as a function of temperature and redshift. The differential
number density of haloes in a mass interval $dM$ about $M$ at
redshift $z$ can be written as

$$ n(M, z)\, dM = -F(\sigma)\, \frac{\rho_m(z)}{M\, \sigma(M, z)}\, \frac{d\sigma(M, z)}{dM}\, dM, \qquad (8) $$

where $\sigma(M, z)$ is the dispersion of the density field at some
comoving scale $R = (3M / 4\pi\rho_m)^{1/3}$ and redshift $z$, and
$\rho_m(z) = \rho_m(z = 0)(1 + z)^3$ the matter density.



4.1.1 Parametrization 

It has been shown by Jenkins et al. (2001) that a good fit (accurate
to better than 20 per cent) to the mass functions recovered from
various large N-body simulations, in the regime $-1.2 < \ln \sigma^{-1} < 1.05$,
is given by

$$ F(\sigma) = 0.315 \exp\left[ -\left| \ln \sigma^{-1} + 0.61 \right|^{3.8} \right], \qquad (9) $$



where the halo mass is defined at a mean overdensity of 180 with
respect to the background matter density, independently of the
cosmological parameters assumed, or equivalently to a mean overdensity
of $180\,\Omega_m(z)$ with respect to the critical density. This result
has been confirmed by Evrard et al. (2002); Hu & Kravtsov (2003);
Klypin et al. (2003); Linder & Jenkins (2003); Reed et al. (2003);
Łokas, Bode & Hoffman (2004); Warren et al. (2006); Reed et al.
(2007), and we will therefore use this fit to estimate the expected
comoving number density of haloes for any given combination of
cosmological parameters. This also makes the like-for-like comparison
with other cluster constraints straightforward, as most rely
on the Jenkins mass function. The dispersion $\sigma$ is calculated using
a fit in analogue to Viana & Liddle (1999), which is accurate
to within two per cent for the range of halo masses relevant for
this work, compared to the exact expression employing the BBKS
transfer function (Bardeen et al. 1986). (In a real data analysis, this
prescription would not be sufficiently accurate; however, for
forecasting purposes it is acceptable.)
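For reference, the fit of equation (9) is straightforward to evaluate. The short Python sketch below is illustrative only: the function name is hypothetical, the coefficients 0.315, 0.61 and 3.8 are those of the Jenkins et al. (2001) fit, and the choice to raise an error outside the quoted range of validity is a design decision of this example, not something prescribed in the text.

```python
import math


def jenkins_f(sigma):
    """Jenkins et al. (2001) multiplicity function F(sigma), equation (9).

    Raises ValueError outside the fitted regime -1.2 <= ln(1/sigma) <= 1.05.
    """
    ln_inv_sigma = -math.log(sigma)
    if not -1.2 <= ln_inv_sigma <= 1.05:
        raise ValueError("sigma outside the range where the fit is calibrated")
    return 0.315 * math.exp(-abs(ln_inv_sigma + 0.61) ** 3.8)
```

The function peaks at $F = 0.315$ where $\ln \sigma^{-1} = -0.61$ and falls off steeply for small $\sigma$, i.e. for rare, massive haloes.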



4.2 Cosmology 

We have already seen that cosmology enters into the prediction of 
cluster numbers as a function of temperature and redshift through 
the selection function, the cluster scaling relations, and the mass 
dispersion. Additionally, the cosmic volume dV/ dz will also enter 
as we need to multiply the differential distribution by this quantity 
(discussed in the following Section). 
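A minimal sketch of the volume factor for a flat $\Lambda$CDM universe is given below. It is not the code used in this work: the function names, the trapezoidal integration and the choice of returning the volume element per steradian (rather than folding in the survey area) are all assumptions made for illustration.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def e_z(z, omega_m=0.3):
    """Dimensionless Hubble rate E(z) for a flat LCDM universe."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def comoving_distance(z, omega_m=0.3, h=0.75, n_steps=1000):
    """Line-of-sight comoving distance in Mpc (trapezoidal rule)."""
    dz = z / n_steps
    total = 0.5 * (1.0 / e_z(0.0, omega_m) + 1.0 / e_z(z, omega_m))
    for i in range(1, n_steps):
        total += 1.0 / e_z(i * dz, omega_m)
    return C_KM_S / (100.0 * h) * total * dz


def dV_dz_per_sr(z, omega_m=0.3, h=0.75):
    """Comoving volume element dV/dz per steradian, in Mpc^3 (flat universe)."""
    d_c = comoving_distance(z, omega_m, h)
    return C_KM_S / (100.0 * h) * d_c ** 2 / e_z(z, omega_m)
```

Multiplying `dV_dz_per_sr` by the survey solid angle in steradians gives the volume per unit redshift probed by a survey of that area.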



Parameter             Value    Prior

$\Omega_m$            0.3      [0.1, 1]
$\Omega_\Lambda$      0.7
$\sigma_8$            0.8      [0.3, 1.3]
$\Omega_b$            0.044    0.044
$h$                   0.75     0.75
$n_s$                 1        1

Table 2. Cosmology assumptions used. Fiducial values are given first,
followed by priors assumed in parameter estimation.



4.2.1 Parameters 

We work within the Cold Dark Matter (CDM) paradigm, with
a spectrum of primordial adiabatic Gaussian density perturbations.
We assume that $\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$, $\sigma_8 = 0.8$,
$\Omega_b = 0.044$ and $h = 0.75$. As we do not expect the XCS to
have particularly competitive constraining power on $\Omega_k$, we
restrict our analysis to the case of a flat universe, $\Omega_k = 0$, in
accordance with observations of e.g. the cosmic microwave background
(Spergel et al. 2007). We take the present-day shape of
the matter power spectrum to be well approximated by a CDM
model with scale-invariant primordial density perturbations whose
transfer function shape parameter is
$\Gamma \approx \Omega_m h \exp[-\Omega_b (1 + \sqrt{2h}/\Omega_m)] \approx 0.18$.
This is the mean value obtained from different
analyses of SDSS data (Szalay et al. 2003; Pope et al. 2004;
Tegmark et al. 2004b; Eisenstein et al. 2005; Blake et al. 2007;
Padmanabhan et al. 2007) and also perfectly compatible with the
3-year WMAP data (Spergel et al. 2007). We have checked that
assuming $\Gamma$ is either 0.16 or 0.20 does not change our results. (In a
real data analysis, using the shape parameter is too simplistic, but
for forecasting purposes it is sufficient.) A summary of our
cosmological parameter assumptions is given in Table 2.
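As a quick arithmetic check of the quoted shape parameter, the expression above can be evaluated for the fiducial values. The function name in this short Python sketch is hypothetical.

```python
import math


def shape_parameter(omega_m=0.3, omega_b=0.044, h=0.75):
    """Gamma = Omega_m h exp[-Omega_b (1 + sqrt(2h) / Omega_m)]."""
    return omega_m * h * math.exp(-omega_b * (1.0 + math.sqrt(2.0 * h) / omega_m))
```

With the fiducial parameters, $\Omega_m h = 0.225$ and the exponential suppression from baryons is about 0.80, which reproduces the value $\Gamma \approx 0.18$ quoted in the text.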



4.3 Combining observables and cosmology 

As we have seen, our cluster distribution calculations involve many 
different steps and components. Importantly, they rely on both sim- 
ulation and observational data, as well as direct theoretical input. A 
schematic overview of the relevant inputs, processes and outputs is
shown in flowchart form in Fig. 5. Collecting all components, the
number of clusters in $dT\,dz$ around $(T, z)$ is given by

$$ n(M(T, z), z)\, \frac{dM}{dT}\, f_{\rm sky}(L(T, z), T, z)\, \frac{dV}{dz}\, dT\, dz, \qquad (10) $$

where $f_{\rm sky}$ combines survey area and selection function. This
expression ignores scatter in the scaling relations and measurement
errors. A complete treatment is given in Appendix A1. The remaining
component for arriving at parameter constraints is the likelihood,
which is described next.



4.4 Likelihood 

Turning our attention to using the cluster distribution prediction for 
cosmological constraints and forecasting, we need an expression 
for the likelihood of an observed catalogue of galaxy clusters. The 
likelihood $\mathcal{L}$ for a given observed catalogue is simply the product
of the Poisson probabilities of observing $N_i$ XCS clusters in the
bin with widths $\Delta T$, $\Delta z$ centered at each of the $(T_i, z_i)$ positions,

$$ \mathcal{L} = \prod_i \frac{\lambda_i^{N_i}\, e^{-\lambda_i}}{N_i!}, \qquad (11) $$
Figure 5. Flowchart for cluster predictions and forecast parameter estimation. The dash-enclosed area indicates the processes that enter in our calculations.
Bi-directional dashed arrows are used to indicate the main circular relations, where information from one part is used to inform another, which then informs 
the first, and so on. 



where

$$ \lambda_i = N(T_i - \Delta T/2,\; T_i + \Delta T/2,\; z_i - \Delta z/2,\; z_i + \Delta z/2) \qquad (12) $$

is the expected number of XCS clusters in bin $i$, taking into account
sky coverage, survey selection function, and any uncertainties in
scaling relations or measurements (see equations in Appendix A1).
We do not take into account the fact that the positions of galaxy
clusters are spatially correlated, because the mean distance between
XCS clusters is typically much larger than the observationally
determined correlation length in the range 10-20 Mpc
(see e.g. Nichol et al. 1992; Romer et al. 1994; Collins et al. 2000;
Gonzalez, Zaritsky & Wechsler 2002; Brodwin et al. 2007), as a
result of the XMM pointings being scattered all over the sky. Even if
the XCS area was contiguous, given the very large depth of the
XCS, the impact of cluster spatial correlations on the estimation of
cosmological parameters with the XCS galaxy cluster abundance
data would be small (White 2002; Hu & Kravtsov 2003; Holder
2006; Hu & Cohn 2006).

As we are seeking to obtain expected/typical constraints, in a
sense a Maximum Likelihood (ML) point estimate, we use $N_i = \lambda_i^*$,
where the asterisk denotes fiducial-model values. Using this
'average-catalogue' construction, we obtain an excellent estimate








Figure 6. Parameter constraints (95 per cent confidence level) for a set of 10 
random realizations of the catalogue Poisson distribution (dashed coloured 
lines) compared to the average-catalogue parameter constraint (solid black 
line). In the right-hand panel, each contour has been re-centered around its 
distribution mean. A constant L-T relation and no L-T or M-T scatter 
was assumed. 



of the size and shape of the expected likelihood contours, but avoid 
the offset in the best fit away from the fiducial parameter values 
that is associated with, e.g., the most likely catalogue. Any random 
realization of a Poisson sample will exhibit such an offset. Examples
can be seen in the right-hand panel of Fig. 6, where the results
for the average-catalogue method are compared to those for random
catalogue realizations. We wish to avoid offsets of this type as we 
Parameter             Description                           No L-T / M-T scatter      No L-T / M-T scatter      L-T / M-T scatter           L-T / M-T scatter
                                                            Constant L-T              Self-similar L-T          Constant L-T                Self-similar L-T

Colouring                                                   Pink                      Green                     Orange                      Blue

L-T:
$\alpha$              Normalization                         -1.90 [-1.90]             -1.92 [-1.92]             -1.90 (flat, unrestricted)  -1.92 (flat, unrestricted)
$\beta$               Slope                                 2.5 [2.5]                 2.5 [2.5]                 2.5 (flat, unrestricted)    2.5 (flat, unrestricted)
$\gamma_s$            Self-similarity exp.                  0 [0]                     1/2 [1/2]                 0 [0]                       1/2 [1/2]
$\gamma_z$            Redshift exp.                         0 [0]                     0 [0]                     0 [-1, 1.5]                 0 [-1, 1.5]
$\sigma_{\log L_X}$   Scatter                               0 [0]                     0 [0]                     0.3 [0.2, 0.4]              0.3 [0.2, 0.4]
                      Max. scatter in units of $\sigma_{\log L_X}$   -                -                         3 [3]                       3 [3]

M-T:
evolution                                                   self-similar, normalized to HIFLUGCS (all four models)
$\sigma_{\log T}$     Scatter                               0 [0]                     0 [0]                     0.1 [0.1]                   0.1 [0.1]
$m_T$                 Max. scatter in units of $\sigma_{\log T}$     -                -                         3 [3]                       3 [3]

Table 3. Cluster scaling relation assumptions and their labelling. Fiducial values are given first, followed by priors assumed in parameter estimation below
(usually in brackets). Note that the colour coding at the top of the table is used to indicate these fiducial models throughout. See Sects. 3.2 & 3.3 and the
Appendix for details.



Quantity                  Realistic                             Worst-case

Redshift $z$              $\sigma(z)/(1 + z) = 0.05$            $\sigma(z)/(1 + z) = 0.10$
                          5% catastrophic (syst.)               10% catastrophic (syst.)

X-ray temperature $T$     XSPEC-simulated                       $\sigma(T) = 3 \times \sigma(T)_{\rm realistic}$
                          XMM-Newton errors

Table 4. Summary of measurement error assumptions and their labelling.
See Sects. 3.4 & 3.5 and the Appendix for details.



are mainly interested in the shape and size of contours, or wish to
separate possible biases from such an offset. This methodology is
explained and motivated in detail in Appendix A2. As stated above,
Fig. 6 compares constraints derived using this method to constraints
derived from a Poisson sample of mock catalogues. The results
confirm that constraints derived using our methodology provide an
excellent estimate of the expected constraints. Note that in future real
data analyses this methodology cannot be used, and there will in
general be some offset.
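The binned Poisson likelihood and the average-catalogue construction can be illustrated with a short Python sketch. The function name is hypothetical, and the use of the log-gamma function in place of the factorial is a choice of this example that allows the non-integer 'observed' counts of the average catalogue; the treatment actually used is the one described in Appendix A2.

```python
import math


def ln_poisson_likelihood(observed, expected):
    """Log of the product of Poisson probabilities over bins.

    observed: counts N_i per (T, z) bin; may be non-integer for the
              average catalogue, hence lgamma(N_i + 1) instead of N_i!.
    expected: model predictions lambda_i per bin (must be > 0).
    """
    ln_like = 0.0
    for n_i, lam_i in zip(observed, expected):
        ln_like += n_i * math.log(lam_i) - lam_i - math.lgamma(n_i + 1.0)
    return ln_like
```

Setting `observed` equal to the fiducial-model expectations places the maximum of the likelihood exactly at the fiducial parameters, since each term $N_i \ln \lambda_i - \lambda_i$ is maximized at $\lambda_i = N_i$. A random Poisson realization of the catalogue would instead shift the best fit away from the fiducial values.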

The exploration of the likelihood function in parameter
space was carried out using a custom code based on standard
Monte Carlo Markov Chain techniques (Gelman & Rubin 1993;
Gilks, Richardson & Spiegelhalter 1996; Lewis & Bridle 2002;
Verde et al. 2003; Tegmark et al. 2004a; Dunkley et al. 2005). The
calculation of the integrals involved in the likelihood was done with
the state-of-the-art numerical integration packages CUBPACK
(Cools & Haegemans 2003) and CUBA (Hahn 2005).



5 RESULTS 
5.1 Labelling 

Generally, the colouring scheme in Table 3 will be used to indicate
the fiducial cluster scaling relation model; see Sects. 3.2 and 3.3
as well as the Appendix. X-ray temperature and redshift errors will
be indicated as 'realistic' or 'worst-case' according to Table 4; see
Sects. 3.4 and 3.5 as well as the Appendix.



5.2 Expected cluster distributions 

5.2. 1 Without measurement errors 

The expected 2D $(T, z)$ distributions of clusters for our four standard
models are shown in Fig. 7a (underlying distributions), Fig. 7b
(expected detections) and Fig. 7c (detection efficiency), where the
selection function has been used to go from Fig. 7a to Fig. 7b. The
expected redshift distributions and total cluster number counts are
shown similarly in Fig. 8. Note that as the L-T relation changes,
so does the expected number of detected clusters, since we are 
more likely to detect a cluster the more luminous it is (and for a 
given temperature, the cluster luminosity increases with redshift 
(b) Expected detections using selection function.

(c) Detected fraction per bin.

Figure 7. Expected cluster number count distributions for $^{500}$XCS, for no L-T nor M-T scatter and no L-T evolution (pink), no L-T nor M-T scatter and
self-similar L-T evolution (green), L-T and M-T scatter and no L-T evolution (orange), and L-T and M-T scatter and self-similar L-T evolution (blue).
Bin sizes are $\Delta z = 0.05$ and $\Delta T = 0.5$ keV.



for self-similar L-T evolution). The underlying distribution how- 
ever is of course not dependent on the L-T relation. We find that 
$^{500}$XCS can be expected to find somewhere in the range of 250-700
clusters for its projected area of 500 deg$^2$ and $0.1 < z < 1.0$,
2 keV < T < 8 keV. This corresponds to around 20 per cent of the 
1500-3300 total number of clusters we would expect to detect with 
no photon count cut-off (effectively a ~ 50-photon cut-off). This 
full set of XCS clusters will constitute a significant sample (relative 
to previous studies), representing around a quarter to a third of the 
actual 7000-10000 clusters present in the observed fields. Going to 
higher redshifts, we roughly estimate that a minimum of 250 clus- 
ters will be found at z > 1, of which at least 10 should have > 500 
photons. 

The XCS DR1 currently covers an area of 132 deg$^2$, for which
125 clusters/groups with measured redshifts and > 500 photons 
have been identified from 164 candidate extended sources (with 
> 500 photon counts). No temperature, redshift or other cuts have 
been applied to this set. In the current redshift sample of 125 clus- 
ters with more than 500 photons, approximately 40 per cent have 
temperatures below 2 keV and are therefore classified as groups. 
We therefore expect the final number of genuine T > 2 keV clus- 
ters we detect in this area to be in the range 75-100, depending 
on the fraction of 'extended' sources detected by the survey that 
turn out not to be clusters (blended point sources, etc.). We have 
here assumed that selection effects in the redshift follow-up do not 
significantly bias this number. For our fiducial cosmology, we find 
that the corresponding expected number of clusters is 80-235, the 
range corresponding to the lower and upper limits from our set of 
scaling-relation assumptions. These two ranges are clearly consis- 
tent. The lower predicted limit corresponds to no L-T evolution, 
and hence - all else equal - the observational result might indicate 
a near-constant L-T relation. 

An overview of the expected observational limits of the
XCS for mass, X-ray temperature, X-ray luminosity and X-ray
flux (in the [0.1, 2.4] keV band) is given in Fig. 9. We have
there defined the detection limit, through the selection function,
as P(detection) > 0.5. These limits are thus the values above
which we expect to detect, with a photon count of 500 or above, 
at least half of the clusters. It is worth noting that the change in 
detection probability is slow as a function of X-ray temperature, 
and hence the concept of e.g. a single flux limit (which would cor- 
respond to a sharp transition between one and zero in the prob- 
ability) is not suitable for defining the XCS sample. The under- 
lying reason for this is that the XMM archive images occupy a 
range of different exposure times, hence individual flux limits. 
Caution is therefore advised when comparing Fig. 9 to similar
plots based on a single flux or mass limit. For comparison, using 
P(detection) > 0.05 to define the detection limit leads to a flux
limit of $\sim 5 \times 10^{-14}$ erg s$^{-1}$ cm$^{-2}$, considerably lower than that
shown in Fig. 9.



5.2.2 With measurement errors 

Introducing measurement errors for redshift and X-ray temperature 
will introduce scattering of clusters across the redshift and tem- 
perature cut-offs. As the cluster distribution is not symmetric with 
respect to these cut-offs, there may be a net increase/decrease in 
the expected number of clusters as a result (a type of Malmquist 
bias). Furthermore, the measurement error distributions may also 
be asymmetric, as is our temperature error distribution. Note that 
the relevant 'underlying' cluster distributions for these purposes are 
the expected detections, shown in Fig. 7b.

The changes in the expected total number of clusters as a result
of different measurement error assumptions are shown in Fig. 10.
We find that the effect of measurement errors on the number count
is significantly less than the effect of intrinsic scaling-relation scatter
(cf. Fig. 8b). This is not surprising since the scaling-relation
scatter is based on the true underlying cluster distribution in Fig. 7a,
a much steeper function than the expected detections in Fig. 7b.

Figure 8. Expected cluster distributions for the $^{500}$XCS, for our four
different cluster scaling relation cases. (a) Underlying cluster distribution.
Note that only the M-T relation is relevant for the underlying distribution,
and we therefore colour according to both L-T assumptions with the same
M-T relation. (b) Expected detections using selection function.
(c) Detected fraction of clusters per bin.

Figure 9. Expected observational limits for the $^{500}$XCS (defined as
P(detection) > 0.5), for the simplest case of no scatter in the cluster
scaling relations and a constant (pink dashed line) or self-similar (green
dashed line) L-T relation. Solid lines correspond to the hard temperature
cut 2 keV < T < 8 keV.

Figure 10. Changes in total number of clusters due to our different error
assumptions, compared to the no-errors distributions in Fig. 8b.

We also see that only in the case of worst-case temperature errors
is the Malmquist bias significant, and as we shall see later only
in this case do the measurement errors give a significant bias in
cosmological constraints, if unaccounted for. For realistic temperature
errors, a net increase in clusters is seen, as the skewness of the
temperature error distribution toward low temperatures (Fig. 4) is
compensated by the somewhat larger number of low-temperature
clusters scattering up in temperature at the low-temperature end.
For worst-case temperature errors the temperature is very poorly
constrained, and this compensatory effect is not sufficient to
counteract the net decrease in number of clusters. Redshift errors tend to
cause a loss of clusters at the low-redshift end, as the smaller cosmic
volume at lower redshifts means more clusters scatter down in
redshift than scatter up. However, the redshift errors also affect the
temperature determination, so that low-temperature clusters scattering
up could give a net increase. For realistic redshift errors the
size of this induced error in temperature is 5 per cent, which is too
small to have a significant impact. For worst-case redshift errors,
we see that for the case with no scaling-relation scatter, the induced
temperature error of 10 per cent reduces the loss of clusters compared
to that for realistic redshift errors. For the case with scaling-relation
scatter, this effect is not significant, presumably due to the
much sharper drop in cluster numbers at low redshifts seen for these
models (Fig. 8b), leading to the direct redshift error dominating.

The fractional change in the number of clusters is very similar
for the case with scaling-relation scatter as without such scatter.
Hence, for the case with scatter, the statistical effect will tend to be
larger, since the difference to the $N_{\rm ideal}$ clusters with no
measurement errors relative to the Poisson error bars,

$$\frac{\left|(1+\delta)N_{\rm ideal} - N_{\rm ideal}\right|}{\sqrt{(1+\delta)N_{\rm ideal}}} = \frac{|\delta|\sqrt{N_{\rm ideal}}}{\sqrt{1+\delta}} , \qquad (13)$$

grows with the number of clusters (and scatter increases the number).
Here, $\delta$ is the fractional change in the number of clusters.
Based on this, we estimate that for all the models we consider, an
upper limit on the fractional change in cluster count for a less than
$1\sigma$ ($2\sigma$) bias in constraints is around 4 (8) per cent, which
compares favourably with the results for realistic errors in Fig. 10. (This
comparison could be made more rigorous using the Kolmogorov-Smirnov
test as in Haiman et al. 2001, but this treatment is sufficient
for our purposes.) Due to computational limitations we have
not calculated the change in number count for the case with scaling- 
relation scatter and both types of measurement errors, but based on 
the results obtained would expect them to be very similar (in frac- 
tional terms) to the results for the no-scatter case. 
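
The Poisson comparison behind these limits can be sketched numerically. The code below assumes that the Poisson error bar is the square root of the perturbed count $(1+\delta)N_{\rm ideal}$, and uses a hypothetical $N_{\rm ideal} = 600$ chosen only for illustration; neither choice is taken from the analysis itself.

```python
import math


def poisson_shift(n_ideal, delta):
    """Change in cluster count, in units of the Poisson error of the perturbed count."""
    n_perturbed = (1.0 + delta) * n_ideal
    return abs(n_perturbed - n_ideal) / math.sqrt(n_perturbed)


def max_fractional_increase(n_ideal, n_sigma):
    """Largest positive delta whose shift stays at n_sigma.

    Solves N * delta**2 - s**2 * delta - s**2 = 0 for the positive root,
    where s is the allowed number of Poisson standard deviations.
    """
    s2 = float(n_sigma) ** 2
    return (s2 + math.sqrt(s2 * s2 + 4.0 * n_ideal * s2)) / (2.0 * n_ideal)


N_IDEAL = 600  # hypothetical number of clusters without measurement errors

if __name__ == "__main__":
    for n_sigma in (1, 2):
        delta = max_fractional_increase(N_IDEAL, n_sigma)
        print(f"{n_sigma} sigma: delta = {100.0 * delta:.1f} per cent")
```

For a sample of several hundred clusters the allowed fractional change is a few per cent, and it shrinks as the sample grows, which is why the same fractional change matters more in the case with scatter.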



5.3 Constraints: without measurement errors 

5.3.1 Known scaling relations, no scatter 

For both choices of L-T relation (constant and self-similar), the
expected constraints are shown in Fig. 11. We expect $^{500}$XCS to
measure $\Omega_{\rm m} = 0.3 \pm 0.02$, $\sigma_8 = 0.8 \pm 0.02$ in each case. The
$\sigma_8$-$\Omega_{\rm m}$ degeneracy differs somewhat between the two L-T cases,
for a constant L-T approximately given by

$\sigma_8 = 0.8\,\ldots$ , (14)

and for a self-similar L-T by

$\ldots$ (15)

These degeneracies are somewhat different from previous studies,
e.g. $\sigma_8 \propto \Omega_{\rm m}^{\ldots}$ in Viana & Liddle (1999). That study however
used only the total number of clusters above a certain temperature
threshold to arrive at constraints. The orientation also depends
on redshift depth (Levine et al. 2003). These constraints are better
than what has been forecast for XMM-LSS (Refregier et al.
2002), but the comparison is not entirely like-for-like as they employ
the Press-Schechter mass function and assume a rather different
fiducial $\sigma_8$ and $\Gamma$. The constraints are also fairly competitive
with what can be expected from other surveys using e.g. the
South Pole Telescope (SPT), Planck or DUET (Majumdar & Mohr
2004; Geisbusch & Hobson 2007), but in making this comparison
one should note that we employ much more restrictive priors; the
set of free parameters is not exactly the same.

Figure 12. Comparison of $^{500}$XCS (500 photons, solid) to the case where all
detections are used (dotted). A constant (left) or self-similar (right) L-T
relation, and no L-T or M-T scatter was assumed. Contours correspond to
68 and 95 per cent confidence levels.

The constraints in Fig. 11a are for a photon-count threshold of
500. Lowering the photon-count threshold so that more clusters are
included in the sample should clearly affect the size of constraints.
We find that using all detections (corresponding to an effective
photon-count threshold of typically ~ 50 photons) improves 1D
constraints by about 40 per cent (Fig. 12). This corresponds to
an increase in the number of clusters used of around 1200-1700
(400-500 per cent). For clusters with few photon counts the temperature
errors will become very large (Liddle et al. 2001). Contamination
from e.g. galaxy groups will also rise sharply with decreasing
photon-count threshold, partly because clusters with low
photon count will tend to have a low temperature. Hence, these
estimates provide only upper limits on the possible constraint
improvement. Taking error and contamination effects into account, it
is likely that there would be only a weak improvement by including
those XCS clusters expected to have a photon count below 500.
However, follow-up observations with e.g. XMM or IXO (formerly
XEUS) could improve the photon statistics of those clusters enough
to make their inclusion in the analysis worthwhile. We discuss this
in more detail in Sect. 6.

Table 5. Expected $1\sigma$ parameter constraints for $^{500}$XCS when marginalized over all other parameters, without measurement errors.

                       Known scaling relations, no scatter    Self-calibration of L-T, with scatter
L-T evolution          Constant        Self-similar           Constant        Self-similar
$\Omega_{\rm m}$       0.30 ± 0.02     0.30 ± 0.02            0.30 ± 0.03     0.30 ± 0.03
$\sigma_8$             0.80 ± 0.02     0.80 ± 0.02            0.80 ± 0.05     0.80 ± 0.04
$\sigma_{\log L_X}$    -               -                      [0.2, 0.4]      [0.2, 0.4]
$\alpha$               -               -                      -1.91 ± 0.12    -1.92 ± 0.12
$\beta$                -               -                      2.50 ± 0.33     2.55 ± 0.31
$\gamma_z$             -               -                      [-1, 1.5]       [-1, 1.5]



5.3.2 Self-calibration of L-T relation, with scatter

Self-calibration is the process by which e.g. the L-T relation can
be constrained jointly with cosmological parameters using only
the (T, z) cluster number counts (Hu 2003; Lima & Hu 2004;
Majumdar & Mohr 2004; Lima & Hu 2005).

We find that jointly fitting for the cosmological parameters and
the L-T relation, $^{500}$XCS will measure $\Omega_{\rm m} = 0.3 \pm 0.03$,
$\sigma_8 = 0.8 \pm 0.05$ under our assumptions. The marginalized $\Omega_{\rm m}$-$\sigma_8$
likelihood distributions are shown in Fig. 11b and the full set of
likelihood distributions in Fig. 13 (note that Fig. 11b is just the top
triangles of these plots). The 1D parameter constraints are listed
in Table 5. The constraints for the case of self-similar L-T evolution
appear narrower than for a constant L-T. This is due to the
redshift-evolution prior, explained below, significantly cutting the
distribution. We thus believe the constant L-T case to be most
representative of the constraints we can expect. As expected, the
constraints on $\Omega_{\rm m}$ and $\sigma_8$ degrade when marginalizing over the
four L-T parameters (compared to Fig. 11a), but still remain
relatively small. In comparison to the South Pole Telescope, Planck,
and DUET (Majumdar & Mohr 2004; Geisbusch & Hobson 2007),
our constraints are still competitive (we lack comparable results for
XMM-LSS, but expect to do better given our larger survey area and
depth). However, if we were to consider self-calibration of the M-T
relation as well (rather than using an external description, as
described in Sect. 3.2), those surveys would have more power than the
XCS (using only archival XMM data) through the use of the cluster
power spectrum (Majumdar & Mohr 2004; Lima & Hu 2004).
In fact, we do not expect XCS to have any significant constraining
power if the M-T relation is self-calibrated as well. We show
examples of the effects of M-T systematics in Sect. 5.5. It has
been shown (e.g. Majumdar & Mohr 2004) that small follow-up
samples can dramatically improve the situation. Therefore,
weak-lensing/SZ follow-up and/or a contiguous e.g. XMM survey would
be highly advantageous (see also Berge et al. 2008; Pierre et al.
2008). Comparing to Fig. 11a, although we lose constraining power
due to an increase in the number of parameters, since we are including
scaling-relation scatter the number of clusters increases significantly,
which mitigates the degradation. Note that, as shown in Table 3,
we fit the data to a power-law L-T relation $\propto (1 + z)^{\gamma_z}$.
Although the functional form for a self-similar L-T used to generate
data is different in principle, we have checked that a power law
can approximate its redshift evolution very well.


Using (T, z) number-count self-calibration, based only on
archival XMM data (Fig. 13), we can constrain the L-T normalization
$\alpha$ to ±0.12 (or ±6 per cent) and the L-T slope $\beta$ to ~ ±0.3
(or ±13 per cent). The self-calibration procedure is not able to
jointly constrain the scatter $\sigma_{\log L_X}$ and redshift evolution $\gamma_z$
significantly. We have therefore imposed flat priors on these parameters,
$0.2 \le \sigma_{\log L_X} \le 0.4$ and $-1 \le \gamma_z \le 1.5$, to limit the
distribution within reasonable bounds of a size reflecting the minimum
accuracy to which we would hope to measure these parameters
from our direct L-T data, i.e. also taking into account the
measured X-ray flux (see also Table 5).
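
In an MCMC setting, flat priors of this kind amount to rejecting any proposed point outside the allowed box. A minimal sketch follows; the parameter names and the dictionary layout are hypothetical and do not reflect the actual analysis code.

```python
import math

# Flat prior ranges quoted in the text; the key names are illustrative only.
PRIOR_BOUNDS = {
    "sigma_logLx": (0.2, 0.4),   # L-T scatter
    "gamma_z": (-1.0, 1.5),      # L-T redshift-evolution index
}


def log_flat_prior(params):
    """Return 0 inside the prior box and -inf outside it."""
    for name, (low, high) in PRIOR_BOUNDS.items():
        if not low <= params[name] <= high:
            return -math.inf
    return 0.0


def log_posterior(params, log_likelihood):
    """Add the flat prior to a user-supplied log-likelihood function."""
    log_prior = log_flat_prior(params)
    if log_prior == -math.inf:
        return -math.inf  # skip the likelihood evaluation entirely
    return log_prior + log_likelihood(params)
```

Since the prior is constant inside the box, it changes the posterior only where the likelihood would otherwise extend beyond the bounds, which is how such a prior can cut the distribution.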

Thus, the self-calibration power to constrain the L-T relation
is present in the data, but as can be seen in Fig. 13 there
are strong degeneracies between parameters. The main degeneracy
is that between $\gamma_z$ and $\sigma_{\log L_X}$; increasing $\sigma_{\log L_X}$ can
easily be offset by reducing $\gamma_z$. This is easy to understand
physically, as they both effectively scale the cluster luminosities
up or down, and corresponds to the observation by several authors
(e.g. Branchesi et al. 2007; Maughan 2007; Nord et al. 2008;
Pacaud et al. 2007) that L-T scatter can mimic L-T evolution
(also discussed in Sect. 3.3.1). The redshift evolution $\gamma_z$ is also
degenerate with the L-T slope $\beta$, which is thus itself degenerate with
$\sigma_{\log L_X}$. Interestingly though, the cosmological parameters show
little degeneracy with $\sigma_{\log L_X}$. It is the result of these degeneracies
that all four L-T parameters cannot be jointly constrained.
Bayesian Complexity (Kunz, Trotta & Parkinson 2006) suggests
that at most five parameters (including $\Omega_{\rm m}$ and $\sigma_8$) can be fully
constrained, which is also what we find in practice. As one might
expect, we will therefore have to rely on our direct L-T measurement
to constrain the L-T scatter and evolution (as proposed by
Verde, Haiman & Spergel 2002; Hu 2003; Battye & Weller 2003;
Wang et al. 2004; Lima & Hu 2005).
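
One simple way to see how scatter and evolution can trade off, offered here as an illustration rather than as the calculation used in the analysis, is that log-normal scatter raises the mean luminosity at fixed temperature. The sketch assumes the scatter is quoted in dex (base-10 logarithm), which is an assumption, and converts the resulting boost into the power-law index that would give the same factor at a chosen redshift.

```python
import math


def mean_luminosity_boost(sigma_log10):
    """Ratio of mean to median luminosity for log-normal scatter given in dex."""
    sigma_ln = sigma_log10 * math.log(10.0)
    return math.exp(0.5 * sigma_ln ** 2)


def equivalent_evolution_index(sigma_log10, redshift):
    """Index gamma such that (1 + z)**gamma equals the scatter-induced boost."""
    return math.log(mean_luminosity_boost(sigma_log10)) / math.log1p(redshift)


if __name__ == "__main__":
    for sigma in (0.2, 0.3, 0.4):
        boost = mean_luminosity_boost(sigma)
        gamma = equivalent_evolution_index(sigma, 0.5)
        print(f"sigma = {sigma:.1f} dex: boost = {boost:.2f}, gamma at z = 0.5: {gamma:.2f}")
```

A larger scatter therefore acts like a positive shift in the evolution index, in line with the sense of the degeneracy seen in Fig. 13.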



The fact that our relatively generous priors on the L-T scatter
and evolution still restrict the distribution, affecting the size
of cosmological constraints, also serves to illustrate a slightly
different point of view: turning the problem around, and using
complementary cosmological data to constrain e.g. $\Omega_{\rm m}$ and $\sigma_8$, thereby
possibly also improving constraints on astrophysical parameters (as
noted by e.g. Levine et al. 2002; Hu & Kravtsov 2003; Hu 2003).



© 0000 RAS, MNRAS 000, 000-000 



16 M. Sahlen et al. 




(a) Constant L-T relation

(b) Self-similar L-T relation. (As can be surmised from some of the 1D distributions, the marginalized and mean likelihoods
approach each other very slowly in the MCMC due to the prior cutting the distribution; however, the statistical properties of
the distribution have converged appropriately.)

Figure 13. Expected 68 and 95 per cent parameter constraints for $^{500}$XCS, with scaling-relation scatter and no measurement errors, and fitting jointly with
the L-T relation (self-calibration), for which reasonable priors on scatter and redshift evolution have been adopted. Solid lines correspond to marginalized likelihood,
dotted lines and shading to mean likelihood. Stars denote the fiducial model assumed.

Figure 14. Expected 68 and 95 per cent parameter constraints for $^{500}$XCS, for known scaling relations, no scatter, and with measurement errors. Stars denote
the fiducial model assumed.

Table 6. Expected $1\sigma$ parameter constraints for $^{500}$XCS when marginalized over the other parameter, for known scaling relations, no scatter, and with
accounted-for measurement errors.



5.4 Constraints: with measurement errors 

5.4.1 Known scaling relations, no scatter 

The effect on derived cosmological constraints from measurement
errors in X-ray temperature and redshift is small. Taking into
account knowledge of the error distributions in the data analysis, we
find that the size of uncertainties increases somewhat compared to
the no-errors case (see Fig. 14a and Table 6, cf. Fig. 11a and
column 1 in Table 5). Interestingly, even with temperature or redshift
errors of an unrealistically large magnitude, the effect on the
constraints is small. As such, we expect the broadening of constraints
due to measurement errors to be a minor effect compared to the
effects of possible systematic errors. These findings are in agreement
with what has been found by e.g. Huterer et al. (2004,
2006) and Lima & Hu (2007).

The effect of ignoring temperature and redshift errors in the
fitting procedure can to some extent model one such systematic:
poor knowledge of the measurement error distributions. As can be
seen in Fig. 14b, we find that when ignoring measurement errors
in the fitting, for all combinations of single measurement errors
(i.e. only z or T at a time), the difference in cosmological
constraints compared to the fiducial model is within $2\sigma$ (and most
are within $1\sigma$). For combined z and T measurement errors, the
same is still true for realistic errors, but for a self-similar L-T and
worst-case errors the bias is larger than $2\sigma$ (see Fig. 14c). These
results agree well with the expectations presented in Sect. 5.2.2,
and thus suggest that a good estimate of the bias in cosmological
constraints due to Malmquist-bias effects can be obtained by
comparing the net Malmquist bias to the Poisson error of the total
cluster number count (at least to roughly discriminate $> 2\sigma$ bias
from $< 2\sigma$ bias). This is not that surprising, as the shape of the
cluster distribution does not differ much between such models, and
thus the total number count carries most of the information (also
noted in Haiman et al. 2001). The 1D constraints corresponding to
Figs. 14b & 14c are listed in Table 7.



5.4.2 Self-calibration of L-T relation, with scatter 

Because of computational limitations we have not explicitly calculated
cosmological constraints for self-calibration with measurement
errors. We have however checked that when scatter is included
in the data, the effect of temperature and redshift errors on
the expected cluster distribution is very similar to the case where no
scatter is included; see Fig. 10 and the discussion in the preceding
Section and Sect. 5.2.2. We thus expect the effect from measurement
errors on constraints where scatter is included, with or
without self-calibration, to be small or negligible,
both in terms of bias if the errors are ignored and broadening of error
contours when errors are taken into account. We therefore believe
that the self-calibration results for the case without measurement
errors (Figs. 11b & 13, Table 5) should provide a good rough
approximation of the expected self-calibration constraints with
measurement errors. Note that this situation is bound to change once
direct L-T data is added to the procedure, as the temperature errors
will then have a significant impact on the accuracy to which
the evolution of the L-T relation can be determined, hence setting
the size of the constraints on $\sigma_{\log L_X}$ and $\gamma_z$. One can therefore
not conclude that temperature errors are largely unimportant for
the cosmological constraints we will ultimately produce from the
data, but an upper limit on the size is set by this work (see e.g.
Verde et al. 2002; Hu 2003; Battye & Weller 2003).

Table 7. Expected $1\sigma$ parameter constraints for $^{500}$XCS when marginalized over the other parameter, for known scaling relations, no scatter, and with
unaccounted-for measurement errors.

Table 8. Expected $1\sigma$ parameter constraints for $^{500}$XCS when marginalized over the other parameter, for systematic errors in the scaling-relation assumptions
(Fig. 15). The different scenarios are numbered according to the order in which they are listed in Fig. 15.

Figure 15. Expected 68 and 95 per cent parameter constraints from the
$^{500}$XCS, for various cluster scaling relation assumptions inconsistent with
the fiducial model used for generating the data. The different data and fitting
assumptions are colour coordinated with the contours, and listed in the
panel above the plot. The model parameters are the same as previously, and
listed in Table 3. The corresponding cluster distributions in redshift and
temperature can be found in Fig. 7b.

5.5 Constraints: systematic biases 

It is clear from the above sections that measurement errors in the
guises we consider are not expected to be a major source of bias or
degradation of constraints vis-à-vis the underlying cluster distribution.
However, if incorrect assumptions as to the characteristics of
the M-T and L-T relations are used when fitting the data, significant
bias may occur, as seen in Figs. 15 & 16 and Tables 8 & 9.

Looking first at Fig. 15 (and Table 8 for the 1D marginalized
constraints), the figure shows how both the size and best-fitting values
of cosmological constraints are affected when ignoring scatter
in the scaling relations, using an inappropriate L-T relation, or
both. The first case (from left in the panel above the plot) shows
how using a self-similar L-T to fit data coming from a constant
L-T leads to an overestimation of $\Omega_{\rm m}$. Comparing the second,
third and fourth cases, we can see that L-T evolution and scaling-relation
scatter all have a similar effect when unaccounted for, all
leading to an overestimation of $\sigma_8$ (and consequently underestimation
of $\Omega_{\rm m}$). As they both have a similar effect, the self-similar
evolution in the fifth case can mimic some of the unaccounted-for scatter,
leading to a lesser overestimation than for the previous cases.
On the other hand, the sixth and last case combines the two effects,
thus leading to a dramatic overestimation of $\sigma_8$. As such, this last
case represents a worst-case scenario for this type of bias.

The other figure, Fig. 16 (and Table 9 for the 1D marginalized
constraints), shows how constraint size and best-fitting values
vary with systematic errors in the M-T relation only. The first two
cases (from left in the panel above the plot) illustrate significantly
underestimating a scatter of 10 or 15 per cent (deviations similar to
what might be expected according to Vikhlinin 2006). This leads
to an overestimation of $\sigma_8$, and relatively narrow constraints, since
scatter significantly increases the number of detected clusters. The
largest impact seen in this figure comes from poor knowledge of
the redshift evolution of the M-T relation, seen in the second pair
of contours. We consider a self-similar M-T analyzed as constant
in redshift, and a constant M-T analyzed as self-similar. In both
cases the deviation from the fiducial model is very significant, with
the size of constraints also affected, due to the fiducial-model
assumptions having a significant impact on the number of detected
clusters. The third, and last, pair of contours shows the effect of
over- or underestimating the normalization mass by 40 per cent (this value
agrees with what might be expected according to e.g. Vikhlinin
2006). Overestimation of the mass leads to an overestimation of
$\sigma_8$, since the higher the assumed mass for a given temperature, the
fewer the number of clusters at that temperature. Underestimation
of the mass consequently also leads to an underestimation of $\sigma_8$.

Table 9. Expected $1\sigma$ parameter constraints for $^{500}$XCS when marginalized over the other parameter, for systematic errors in the M-T relation assumptions
(Fig. 16). The different scenarios are numbered according to the order in which they are listed in Fig. 16.

Figure 16. Expected 68 and 95 per cent parameter constraints from the
$^{500}$XCS, for various mass-temperature-relation assumptions inconsistent
with the fiducial model used for generating the data. The different data and
fitting assumptions are colour coordinated with the contours, and listed in
the panel above the plot. Throughout, a self-similar L-T relation with scatter
(as specified in Table 3) has been assumed.

In most cases, the constraints are more than $3\sigma$ away from the
fiducial model. Referring back to the discussion on Poisson errors
in Sect. 5.2.2 and applying that to the relevant cluster distributions
(see Fig. 8b), this result is not surprising. We find that in terms
of total number count Poisson error bars, the discrepancy between
data and fitting assumptions is at least $\sim 6\sigma$. These limitations
will apply to any galaxy cluster survey employing cluster scaling
relations to arrive at results, certainly all X-ray surveys, with the
exact susceptibility to bias given by the combination of true cluster
distribution and survey selection function. This stresses the importance
of knowledge of the behaviour of the scaling relations in the
form of self-calibration and/or separate follow-up information. For
this, accurate knowledge of the selection function is necessary, so
that scaling-relation scatter and evolution can be correctly distinguished
(as pointed out in e.g. Pacaud et al. 2007).



6 DISCUSSION AND CONCLUSIONS 
6.1 The XCS forecast 

The XMM Cluster Survey (XCS) will cover 500 deg$^2$ and is expected
to produce one of the largest catalogues of galaxy clusters
so far, with ~ 1500-3300 clusters having 0.1 < z < 1, 2 keV <
T < 8 keV. Around 20 per cent of these will belong to the $^{500}$XCS
sample that have sufficient photons (> 500) for their X-ray temperature
to be reliably estimated. In a rough approximation, we
expect to find an additional 250 or more clusters at z > 1, of
which at least 10 should have > 500 photons. We have proven
the potential of the XCS with the recent discovery of the most distant
galaxy cluster known, XMMXCS J2215.9-1738 at z = 1.457
(Stanford et al. 2006; Hilton et al. 2007). Cluster redshifts are obtained
from both public-domain photometry and the NOAO-XCS
Survey (NXS; Miller et al. 2006). To date, more than 400 XCS candidates
have been optically confirmed. An initial observational area
of 132 deg$^2$ (XCS DR1) contains in the range 75-100 detected
clusters with T > 2 keV and > 500 photons. This number is consistent
with the theoretical expectations presented here for our fiducial
models.

We have shown the power in determining both cosmologi- 
cal and astrophysical parameters expected from the XMM archive, 
using only self-calibration from the (T, z) distribution and tak- 
ing detailed selection function, cluster distribution and measure- 
ment error modeling into account in a Monte Carlo Markov Chain 
(MCMC) setting. Inclusion of the selection function requires the 
specification of the luminosity-temperature relation, and thus en- 
ables us to also self-calibrate this relation. We also introduce and 
motivate a new 'smoothed Maximum Likelihood estimate' of the 
expected constraints, which can be regarded as intermediate be- 
tween a Fisher matrix analysis and a full mock catalogue ensemble 
averaging in MCMC. 

We expect the $^{500}$XCS to measure

$\Omega_{\rm m} = 0.30 \pm 0.03$, $\sigma_8 = 0.80 \pm 0.05$

for a flat $\Lambda$CDM Universe, the uncertainty on $\Omega_{\rm m}$ also being that
on $\Omega_\Lambda$. The cosmological constraints are similar to those already
obtained using gas mass fraction measurements (e.g. Allen et al.
2002, 2008). They are better than those that can be expected from
XMM-LSS (Refregier et al. 2002), because XCS covers more area
than XMM-LSS (predicted maximum area of 64 deg$^2$, but so far
results for only 5 deg$^2$ have been published) and has a higher average
exposure time. Our constraints are also somewhat competitive
compared to expected constraints from e.g. the SPT, Planck,
and DUET (Majumdar & Mohr 2004; Geisbusch & Hobson 2007),
except if self-calibration of the mass-temperature relation is also
considered. The scatter and redshift evolution of the luminosity-temperature
relation cannot be jointly constrained to a significant
degree by the self-calibration data alone; additional data - archival
XMM and/or follow-up - is needed to distinguish e.g. no evolution
from self-similar evolution if the scatter is left as a free parameter.
Like e.g. Levine et al. (2002), Hu & Kravtsov (2003) and Hu (2003),
we note that there is also potential to use this conversely, to let
complementary cosmological data help constrain astrophysical parameters.
We may return to this in future work.

6.2 Measurement errors 

We include for the first time realistic temperature measurement
errors, based on detailed XSPEC simulations of the XMM fields,
and propagate redshift errors to the temperature determination. The
presence of realistic or worst-case measurement errors in X-ray
temperature and redshift will have only a small impact on the accuracy
to which cosmological parameters can be expected to be
measured, of order 0.01 in 1D confidence limits. Furthermore, we
find that imperfect knowledge of the variances of measurement errors,
or the presence of catastrophic photometric redshifts, should
not produce significant bias in the cosmological constraints. We
conclude that, under these assumptions, even ignoring the expected
realistic measurement errors in the data analysis will provide a
reasonable estimate of the true constraints. For the case where direct
L-T data is included in the analysis, the impact of measurement
errors (including susceptibility to systematics) will be larger
(Verde et al. 2002; Hu 2003; Battye & Weller 2003). The size of
constraints forecast here provides an upper limit for that scenario.

It is already known (Huterer et al. 2004, 2006; Lima & Hu
2007) that irreducible systematic errors in redshift estimation are a
potential problem for cluster surveys, but we leave for future work
the specific requirements for the XCS.

We do not yet take into account the variation of photon count 
with temperature/luminosity, and how that affects the size of tem- 
perature errors. Including this effect, instead of employing a lower 
threshold only, may well improve the size of our constraints. However,
the maximum improvement for self-calibration is small. For
inclusion of direct L-T data the importance will be larger.

6.3 Cluster scaling relations 

The choice of L-T relation itself has no significant impact on the 
size of cosmological constraints. In our considerations, we do not 
yet take into account the separate L-T measurement to be per- 
formed by the XCS. In the final data analysis, the L-T measure- 
ments will be jointly fitted with the cluster distribution. Hence, our 
expected constraints represent a worst-case scenario of no direct 
data on the L-T relation. We plan to revisit the issue of the XCS 
L-T measurement in the future. As an example, estimates for the
DUET survey (Majumdar & Mohr 2004) show that follow-up information
on the M-T relation can improve constraints by more
than a factor of three.

We quantitatively show that making incorrect assumptions
about the cluster scaling relations can typically result in at least a
$2\sigma$-$3\sigma$ bias in cosmological constraints, a result which can be considered
generic for all X-ray and SZ cluster surveys, and those optical
surveys relying on cluster scaling relations. Thus, parametrizing
the scaling relations appropriately and using self-calibration and/or
follow-up information is crucial to arrive at robust results. This
places high demands on precise characterization of the survey selection
function to accurately distinguish scaling-relation evolution
and scatter. That is not a problem for X-ray cluster surveys (as they
generally have the best-understood selection functions), and shows
the importance of the XCS measurement of the L-T relation for
cosmological applications. The XMM-LSS collaboration have already
pointed this out, and obtained some first results (Pacaud et al.
2007). A potential pitfall however is the possible redshift evolution
of the L-T scatter, as observed in the CLEF simulation (Kay et al.
2007). This has not so far been considered in the literature, but is a
possible source of bias that should be better understood. The future
IXO mission (Bleeker & Mendez 2002) will be of great importance
for precision measurements of all details of the L-T relation. The
XCS will provide thousands of clusters for IXO to target.

An important source of uncertainty is the mass-temperature
relation. We have shown quantitatively that, as for the
luminosity-temperature relation, imperfect knowledge can easily lead to
significant bias. Joint estimation of the mass-temperature relation
will lead to broader constraints, and we do not expect the XCS
to be able to constrain both the L-T and M-T relations
simultaneously. Generally, it has been found that the M-T relation
needs to be known to an accuracy of better than 10 per cent, and that
self-calibration (particularly if making use of the power spectrum,
which XCS cannot do) and/or small follow-up samples can achieve
that (Holder et al. 2001; Haiman et al. 2001; Levine et al. 2002;
Majumdar & Mohr 2003, 2004; Wang et al. 2004; Lima & Hu
2004, 2005). A recent development is the claim that
the X-ray luminosity is a better mass proxy than previously thought
(Maughan 2007). This remains somewhat controversial, but could
be worthwhile to consider. Its potentially low scatter and the
prospect of including low-temperature clusters, for which the
temperature cannot be accurately measured, make this interesting.
Likewise, employing the quantity $Y_X$ (e.g. Kravtsov et al. 2006)
would also be an interesting option to consider. We leave the
XCS-specific details for future work.

It has also been noted by, amongst others, Younger et al.
(2006) and Ascasibar & Diego (2008) that the choice of
parametrization for the cluster scaling relations can have a significant
impact on the size of cosmological constraints, and they argue
that a physically-motivated form is beneficial. As also noted
by Lima & Hu (2007), efforts in correlating physical properties of
clusters, such as that of Shaw et al. (2006), could therefore be of
great importance for the size of cosmological constraints, not just
biases or astrophysics. However, as the observed dependence on
parametrization appears to largely come from an
$\Omega_{\rm m}$-$\Omega_\Lambda$ degeneracy, and in this work we assume
that $\Omega_\Lambda = 1 - \Omega_{\rm m}$, we do not
expect this to be of importance for our results here.

6.4 Selection function 

The cluster model assumptions made in our calculation of the se- 
lection function could have an impact on cluster detectability and 
thus cosmological constraints. For this reason, we have studied the 
selection function dependence on cluster structure parameters (for 
the beta model assumed). For core radii within reasonable bounds, 
the relative difference compared to our standard beta model is of 
the order 10 per cent. This number is, however, an overestimate of
the resulting overall uncertainty in cluster number predictions, as it
does not include any detailed knowledge of the cluster population,
in particular the detailed distribution of cluster-model parameters.
Therefore, the significance is limited. We expect the real
differences to be smaller. These results agree with those of Burenin et al.
(2007) for the 400d survey, which show that reasonable variations
in cluster size, morphology and scaling relations induce an
uncertainty in the detectability for a given flux of typically less than 5 per
cent. In a future, real analysis, the cluster-model parameter
distribution would be included along with a selection function dependence
on these parameters, thereby also significantly reducing any
uncertainty. We are currently studying the selection function further
in this respect. Among other things, we are applying the detection
pipeline to the simulated clusters from CLEF (Kay et al. 2007), to
understand the impact of various cluster properties (Hosmer et al.,
in preparation).

6.5 Sensitivity to fiducial model and priors 

Dropping our assumption of spatial flatness, we do not expect
particularly strong constraining power on $\Omega_\Lambda$, based on results
such as Allen et al. (2002) and Majumdar & Mohr (2004). One should note,
though, that the utilization of X-ray galaxy clusters over a large
range of redshifts has additional constraining power compared to
Allen et al. (2002), as those results are based only on the gas mass
fraction in nearby clusters; most of the constraining power on
$\Omega_\Lambda$ comes from z > 0.5 (Holder et al. 2001; Levine et al. 2002).
The constraints on $\Omega_{\rm m}$ and $\sigma_8$ should broaden as a
consequence of dropping the flatness assumption; however, Majumdar & Mohr
(2004) find that such an increase tends to be fairly marginal. This
increase could arguably also be alleviated by employing appropriate
parametrizations for the cluster scaling relations, as discussed
above in Sect. 6.3. This, as well as the constraining power on
modified-gravity models, is a topic for further investigation.

The assumption of fixed values for the priors of some cosmo- 
logical parameters, e.g. the scalar spectral index and the Hubble 
constant, is not realistic given the uncertainty that still exists re- 
garding their true values. Relaxing those priors would increase the 
size of constraints, though probably not in a significant manner. 



6.6 Requirements for a real analysis 

Our methodology contains a number of simplifications that are
sufficient for a forecast, but for a future real analysis will need to
be modified. In particular, this applies to the calculation of the
matter-field dispersion $\sigma$, which ought to be calculated employing
e.g. Eisenstein & Hu (1998) or CMBFAST (Seljak & Zaldarriaga
1996) for the transfer function. The uncertainty in the selection
function will have to be better understood and included explicitly.
In this context, understanding the distribution and impact of cluster
structure parameters and their correlation is important. New mass
function fits which are more accurate than Jenkins et al. (2001)
should also be used, and the impact of uncertainty included.
Uncertainty in the M-T normalization chosen will have to be more
carefully considered, as will possible systematic uncertainties in
temperature definitions and in photometric redshifts. We will in the
final analysis fit the L-T relation jointly with cosmology (i.e. use
flux data as well), thereby achieving some improvement on the
constraints presented here.



6.7 The role of XCS, future surveys and outlook

The XCS is a complete investigation into the galaxy cluster content 
of the XMM archive. It will be a pathfinder survey for the many 
ongoing and planned galaxy cluster surveys, and in particular help 
guide design of new XMM observations and the upcoming X-ray 
missions eROSITA and IXO. The L-T measurement from the XCS 
will be the most accurate so far, and will provide an important cal- 
ibration for potentially all cluster surveys. Although the expected 
cosmological constraints may not be particularly tight compared to 
those from the combination of other cosmological data, they will 
provide important complementary, independent tests of the stan- 
dard cosmological model from the largest X-ray cluster sample. 



Particularly, we expect to provide good independent constraints on
$\sigma_8$.

For example, re-imaging of the XCS sample with XMM and/or
IXO could improve temperature errors sufficiently, so that at least
some of the remaining ~80 per cent of the XCS clusters not
in the $^{500}$XCS could be used for constraints (this corresponds to
no photon-count cut-off in our calculations, effectively a
~50-photon cut-off). We find that an upper limit on the improvement in
constraints is by $1\sigma$ (2D) or ~40 per cent (1D), which thus also
represents the best one could possibly do with the current XMM
archive using (T, z) self-calibration only. Once we add direct
L-T data to the procedure, the lever arm from re-imaging will be
more significant. The XMM-LSS collaboration argue (Pacaud et al.
2007) that the most efficient way to constrain the L-T evolution is
to increase the sample size, rather than improve temperature
errors, and propose a future 200 deg$^2$ XMM survey with this
rationale (Pierre et al. 2008). A complementary approach to additional
observations would be to also use the luminosity-mass relation as
mass proxy for those clusters for which the temperature
determination is difficult (Maughan 2007), or use the relatively new quantity
$Y_X$ advocated by e.g. Kravtsov et al. (2006).

The future for galaxy clusters as a precision and complemen- 
tary cosmological tool looks increasingly promising, with a range 
of surveys planned or underway, and numerous simulations under- 
taken to understand the mass function and cluster physics. The XCS 
will produce one of the largest ever catalogues of galaxy clusters, 
providing valuable information on cosmology and cluster physics 
through the luminosity-temperature relation, beating a path for 
the many planned surveys. The interface between well-understood 
cluster physics and cosmology, cross-calibration, and complemen- 
tary cosmological data will surely be important for constraining 
dark energy, the primordial power spectrum, and cluster physics 
over the next decade and beyond.

APPENDIX A: EQUATIONS

A1 Cluster counts

A1.1 Ideal measurements

The expected number of clusters with temperatures between $T_1$ and $T_2$ at redshifts between $z_1$ and $z_2$ when measurements are assumed to
be exact is given by

$$N_{\rm ideal}(T_1, T_2, z_1, z_2) = \int_{z_1}^{z_2}\int_{T_1}^{T_2} n_{\rm ideal}(T, z)\,{\rm d}T\,{\rm d}z, \tag{A1}$$



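where $n_{\rm ideal}$ is the actual number density of clusters in temperature and redshift, given by the convolution of the mass function $n(M_t, z)$
with cluster scaling relations, their scatter (through $p(L_t, M_t)$), cosmic volume ${\rm d}V/{\rm d}z$ and the survey selection function $f_{\rm sky}$ (including sky
coverage):

$$n_{\rm ideal}(T,z) = \int_{M_t}\int_{L_t} n(M_t,z)\, f_{\rm sky}(L_t,T,z)\, p(L_t,M_t\,|\,L(T,z),M(T,z))\,\frac{{\rm d}V}{{\rm d}z}\,{\rm d}L_t\,{\rm d}M_t . \tag{A2}$$

The scaling-relation scatter probability distributions are assumed to be statistically independent,

$$p(L_t,M_t\,|\,L(T,z),M(T,z)) = p(L_t\,|\,L(T,z)) \times p(M_t\,|\,M(T,z)), \tag{A3}$$

each having a log-normal form:

$$p(M_t\,|\,M(T,z))\,{\rm d}M_t = p(T_t^M\,|\,T,z)\,\frac{{\rm d}T_t^M}{{\rm d}M_t}\,{\rm d}M_t = \frac{1}{{\rm erf}(m_T/\sqrt{2})\,\sqrt{2\pi}\,\sigma_{\log T}} \exp\left[-\frac{1}{2}\frac{\left(\log_{10}T - \log_{10}T_t^M\right)^2}{\sigma_{\log T}^2}\right] \Theta\left(m_T\sigma_{\log T} - \left|\log_{10}T - \log_{10}T_t^M\right|\right){\rm d}\log_{10}T_t^M, \tag{A4}$$

$$p(L_t\,|\,L(T,z))\,{\rm d}L_t = \frac{1}{{\rm erf}(m_L/\sqrt{2})\,\sqrt{2\pi}\,\sigma_{\log L_X}} \exp\left[-\frac{1}{2}\frac{\left(\log_{10}L(T,z) - \log_{10}L_t\right)^2}{\sigma_{\log L_X}^2}\right] \Theta\left(m_L\sigma_{\log L_X} - \left|\log_{10}L(T,z) - \log_{10}L_t\right|\right){\rm d}\log_{10}L_t, \tag{A5}$$

where $T_t^M$ is the temperature corresponding to the mass $M_t$ and $\Theta$ is the Heaviside step function. The parameters $m_T$, $m_L$, $\sigma_{\log T}$ and $\sigma_{\log L_X}$ are described further in Sections 3.2 and 3.3 as well as Table 3.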
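As an illustration of the truncated log-normal scatter in equations (A4) and (A5), the following minimal Python sketch evaluates such a density per unit $\log_{10}$ and checks numerically that the ${\rm erf}(m/\sqrt{2})$ factor normalizes it over the truncated range. The function names and the numerical values (a scatter of 0.3 dex, truncation at $m = 2$, a mean log-luminosity of 44) are illustrative assumptions and are not values taken from this work.

```python
import math


def truncated_lognormal_pdf(log10_x, log10_mean, sigma_log, m):
    """Probability density per unit log10(x) for a log-normal distribution
    truncated at m standard deviations from the mean relation."""
    deviation = log10_x - log10_mean
    if abs(deviation) > m * sigma_log:
        return 0.0  # Heaviside step: no scatter beyond m sigma
    norm = math.erf(m / math.sqrt(2.0)) * math.sqrt(2.0 * math.pi) * sigma_log
    return math.exp(-0.5 * (deviation / sigma_log) ** 2) / norm


def midpoint_integral(func, lower, upper, steps=20000):
    """Simple midpoint-rule quadrature."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))


# Hypothetical values, chosen only for illustration.
SIGMA_LOG, M_CUT, LOG10_MEAN = 0.3, 2.0, 44.0

total_probability = midpoint_integral(
    lambda x: truncated_lognormal_pdf(x, LOG10_MEAN, SIGMA_LOG, M_CUT),
    LOG10_MEAN - M_CUT * SIGMA_LOG,
    LOG10_MEAN + M_CUT * SIGMA_LOG,
)
print(f"integrated probability = {total_probability:.6f}")
```

Without the error-function factor the truncated Gaussian would integrate to ${\rm erf}(m/\sqrt{2})$ rather than unity, which is why it appears in the normalization.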



A1.2 Measurement errors 

When treating the case of measurement errors in T and z, we must distinguish observed and true temperature. The expected number of
clusters between observed temperatures $T_1$ and $T_2$, and redshifts $z_1$ and $z_2$, is given by

$$N_{\rm obs}(T_1, T_2, z_1, z_2) = \int_{z_1}^{z_2}\int_{T_1}^{T_2} n(T, z)\,{\rm d}T\,{\rm d}z, \tag{A6}$$

where $n$ represents the cluster distribution marginalized over the probability distribution for measurements, i.e.

$$n(T,z) = \int_{z_t}\int_{T_t} n_{\rm ideal}(T_t,z_t)\,p(T,z\,|\,T_t,z_t)\,{\rm d}T_t\,{\rm d}z_t = \int_{z_t}\int_{T_t} n_{\rm ideal}(T_t,z_t)\,p\left(\frac{1+z_t}{1+z}\,T\,\Big|\,T_t,z_t\right)p(z\,|\,z_t)\left(\frac{1+z_t}{1+z}\right){\rm d}T_t\,{\rm d}z_t, \tag{A7}$$

Realistic T errors    Worst-case T errors
Eq. (A8)              Eq. (A8) with std. dev. $3 \times \sigma_T$

Table A1. Temperature error specifications.



Parameter     Description                                                                    Realistic z errors   Worst-case z errors
$\sigma_0$    Standard deviation at z = 0                                                    0.05                 0.10
$c$           Catastrophic standard deviation in units of $\sigma_0$                         4                    4
$n$           Min. deviation from mean in units of $c\sigma_0$ for catastrophic redshifts    1                    1
$f_{\rm cat}$ Fraction of catastrophic redshifts                                             0.05                 0.10

Table A2. Redshift error specifications.



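where $z_t$ and $T_t$ are true redshift and temperature, and in the last step the relation $T_{\rm obs} = (1 + z_{\rm obs})\,T'/(1 + z_t)$ was used to go from
observed to true temperature. The temperature and redshift measurement probability distributions are modelled by

$$p(T\,|\,T_t,z_t)\,{\rm d}T = \frac{1}{\sqrt{\pi/2}\,\left(\sigma_T^+ + \sigma_T^-\right)} \exp\left[-\frac{1}{2}\frac{\left(T - T_{\rm mod}(T_t,z_t)\right)^2}{\sigma_T^2(T_t,z_t)}\right]{\rm d}T, \tag{A8}$$

$$\sigma_T(T_t,z_t) = \begin{cases}\sigma_T^+ & T > T_{\rm mod}(T_t,z_t) \\ \sigma_T^- & \text{otherwise,}\end{cases}$$

$$T_{\rm mod}(T_t,z_t)/T_t = \alpha_c + \alpha_T T_t + \alpha_z z_t + \alpha_{zz} z_t^2 + \alpha_{TT} T_t^2 + \alpha_{zT} z_t T_t,$$

$$\sigma_T^+ = T_t\left(\beta_c^+ + \beta_T^+ T_t + \beta_z^+ z_t + \beta_{zz}^+ z_t^2 + \beta_{TT}^+ T_t^2 + \beta_{zT}^+ z_t T_t\right),$$

$$\sigma_T^- = T_t\left(\beta_c^- + \beta_T^- T_t + \beta_z^- z_t + \beta_{zz}^- z_t^2 + \beta_{TT}^- T_t^2 + \beta_{zT}^- z_t T_t\right),$$

where the $\alpha$ and $\beta$ are determined from simulations (see Sect. 3.5), and

$$p(z\,|\,z_t)\,{\rm d}z = \left\{\frac{(1 - f_{\rm cat})\exp\left[-\frac{1}{2}\frac{(z - z_t)^2}{\sigma_0^2(1+z_t)^2}\right]}{\sqrt{\pi/2}\,\sigma_0(1+z_t)\left[1 + {\rm erf}\left(\frac{z_t}{\sqrt{2}\,\sigma_0(1+z_t)}\right)\right]} + \frac{f_{\rm cat}}{N^{\rm cat}(z_t)}\exp\left[-\frac{(z - z_t)^2}{2c^2\sigma_0^2(1+z_t)^2}\right]\Theta\left(|z - z_t| - nc\sigma_0(1+z_t)\right)\right\}\Theta(z)\,{\rm d}z, \tag{A9}$$

$$N^{\rm cat}(z_t) = \sqrt{2\pi}\,c\sigma_0(1+z_t)\left\{{\rm erfc}\left(\frac{n}{\sqrt{2}}\right) - \frac{1}{2}\min\left[{\rm erfc}\left(\frac{n}{\sqrt{2}}\right),\ {\rm erfc}\left(\frac{z_t}{\sqrt{2}\,c\sigma_0(1+z_t)}\right)\right]\right\}.$$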

The temperature error assumptions made in this work are described further in Sect. 3.5 and summarized in Table A1. The redshift error
assumptions for parameters $f_{\rm cat}$, $n$, $c$ and $\sigma_0$ are described further in Section 3.4 and Table A2. Note that the probability distributions of true
temperatures and redshifts, the Bayesian "inverses" of the above, are weighted by the cluster distribution and given by

$$p(T_t\,|\,T, z_t)\,{\rm d}T_t = \frac{p(T\,|\,T_t,z_t)\,n_{\rm ideal}(T_t,z_t)\,{\rm d}T_t}{\int p(T\,|\,T',z_t)\,n_{\rm ideal}(T',z_t)\,{\rm d}T'}, \tag{A10}$$

$$p(z_t\,|\,T_t, z)\,{\rm d}z_t = \frac{p(z\,|\,z_t)\,n_{\rm ideal}(T_t,z_t)\,{\rm d}z_t}{\int p(z\,|\,z')\,n_{\rm ideal}(T_t,z')\,{\rm d}z'}. \tag{A11}$$
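The redshift error model of equation (A9) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the core Gaussian is taken to be normalized over $z \geq 0$, the catastrophic component over its allowed tails, and the default parameter values are the "realistic" column of Table A2. The function names are hypothetical. The check confirms that the distribution integrates to unity both when the lower catastrophic tail is available and when it is cut off by $z \geq 0$.

```python
import math


def photoz_pdf(z, z_true, sigma0=0.05, c=4.0, n=1.0, f_cat=0.05):
    """p(z | z_true): Gaussian core plus catastrophic tails, restricted to z >= 0."""
    if z < 0.0:
        return 0.0
    sigma = sigma0 * (1.0 + z_true)
    sigma_cat = c * sigma
    # Core Gaussian, normalized over z >= 0.
    core_norm = math.sqrt(math.pi / 2.0) * sigma * (
        1.0 + math.erf(z_true / (math.sqrt(2.0) * sigma)))
    core = (1.0 - f_cat) * math.exp(-0.5 * ((z - z_true) / sigma) ** 2) / core_norm
    # Catastrophic Gaussian, kept only where |z - z_true| > n * sigma_cat.
    if abs(z - z_true) <= n * sigma_cat:
        return core
    tail = math.erfc(n / math.sqrt(2.0))
    low_z_cut = math.erfc(z_true / (math.sqrt(2.0) * sigma_cat))
    cat_norm = math.sqrt(2.0 * math.pi) * sigma_cat * (
        tail - 0.5 * min(tail, low_z_cut))
    cat = f_cat * math.exp(-0.5 * ((z - z_true) / sigma_cat) ** 2) / cat_norm
    return core + cat


def total_probability(z_true, steps=200000):
    """Midpoint-rule integral of p(z | z_true) from z = 0 to far into the upper tail."""
    upper = z_true + 3.0  # at least ten catastrophic sigma for the defaults used here
    width = upper / steps
    return width * sum(photoz_pdf((k + 0.5) * width, z_true) for k in range(steps))


for z_true in (0.1, 0.5):
    print(f"z_true = {z_true}: integral = {total_probability(z_true):.5f}")
```

The `min` in the catastrophic normalization handles the low-redshift case in which the lower tail would lie entirely at negative redshift, so that only the upper tail contributes.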



A2 Expected likelihood 

In order to evaluate the expected constraints from a survey, one needs to consider some ensemble of possible outcomes and from that 
calculate, by ensemble averaging or otherwise (given a specification of 'expected'), the expected constraints. We have chosen a type of 
smoothed Maximum Likelihood (ML) estimate, that captures the most likely shape and size of constraint contours but removes the offset 
associated with a traditional ML point estimate. In the following we show in detail that our expected constraints can be obtained accurately 
without averaging over many data realizations, but rather by using only an 'average catalogue'. 

Having an expression for the single-catalogue likelihood, we seek to estimate the expected constraints for the survey. We define this as
the expected constraints for a set consisting of a certain fraction $\epsilon$ of most likely catalogues. We start by setting up some formalism and prove
our central theorem, and then go on to use this for our application.

Definition 1. Let $\{C_j\}$ denote a set of catalogues indexed by $j$. Let $N$ be the number of bins of a catalogue. Let $N_i$ or $N_i^j$ be the observed
number count for bin $i$, in catalogue $j$ where the superscript is present. Let $\lambda_i$ be the Poisson mean for bin $i$ at which the likelihood is evaluated,
and $\lambda_i^*$ the same for the fiducial model used to generate the catalogues. Let $\delta_i^j = N_i^j - \lambda_i^*$ measure the deviation of the observed number
count from the fiducial-model mean.

Definition 2. Let the expected likelihood for the fraction $\epsilon$ of most likely catalogues in a Poisson ensemble be given by

$$\langle\mathcal{L}\rangle_\epsilon = \left\langle\prod_{i=1}^{N}\frac{\lambda_i^{N_i}\,e^{-\lambda_i}}{N_i!}\right\rangle_\epsilon, \tag{A12}$$

where the product runs over the $N$ bins in a catalogue, and $\langle\cdot\rangle_\epsilon$ denotes a Poisson ensemble average restricted to catalogues $C_j$ such that
$\sum_j P(C_j)\,\Theta(P(C_j) - P_\epsilon) = \epsilon$ (with $\Theta$ the Heaviside step function). This expression also defines the probability threshold $P_\epsilon$.

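Corollary 1. It follows from the above definition and the Poisson distribution that

$$\langle\mathcal{L}\rangle_\epsilon = \epsilon^{-1}\sum_j\prod_{i=1}^{N}\frac{(\lambda_i\lambda_i^*)^{N_i^j}\,e^{-(\lambda_i + \lambda_i^*)}}{\left(N_i^j!\right)^2}\,\Theta(P(C_j) - P_\epsilon). \tag{A13}$$

Definition 3. Let

$$C_\pm = \left\{\delta_i \in \left\{\lceil\lambda_i^*\rceil - \lambda_i^*,\ \lfloor\lambda_i^*\rfloor - \lambda_i^*\right\}\ \forall i\right\} \tag{A14}$$

be the set of catalogues consisting of the $2^N$ catalogues between the most likely catalogue (for which $\delta_i = \lfloor\lambda_i^*\rfloor - \lambda_i^*\ \forall i$) to the catalogue with
probability $P_\epsilon$ (for which $\delta_i = \lceil\lambda_i^*\rceil - \lambda_i^*\ \forall i$). Here, $\lceil\cdot\rceil$ and $\lfloor\cdot\rfloor$ are the ceiling and floor operators respectively.

Remark 1. The choice of this set of catalogues will be convenient and is suitable to define a smoothed ML estimate.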

We now come to the central theorem: 

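Theorem 1. For the catalogue set $C_\pm$,

$$\ln\langle\mathcal{L}\rangle_\epsilon = \sum_i\left[\left(\lceil\lambda_i^*\rceil - \frac{1}{2}\right)\ln\lambda_i - \lambda_i\right] + \mathcal{O}(\delta^3) + {\rm const.}$$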



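Proof. The probability level $\epsilon$ for the catalogue set $C_\pm$ can be estimated through

$$\epsilon < 2^N\prod_i\frac{(\lambda_i^*)^{\lfloor\lambda_i^*\rfloor}\,e^{-\lambda_i^*}}{\lfloor\lambda_i^*\rfloor!}, \tag{A15}$$

$$\epsilon > 2^N\prod_i\frac{(\lambda_i^*)^{\lceil\lambda_i^*\rceil}\,e^{-\lambda_i^*}}{\lceil\lambda_i^*\rceil!}. \tag{A16}$$

We approximate

$$\epsilon \approx 2^N e^{-\sum_i\lambda_i^*}\,\frac{\prod_i(\lambda_i^*)^{\lambda_i^*}}{\prod_i\Gamma(1 + \lambda_i^*)}, \tag{A17}$$

where we have used the gamma function as a continuation of the factorial, effectively extending the Poisson distribution to the gamma
distribution for non-integer values of $N_i$, something we will use throughout. We can now write

$$\langle\mathcal{L}\rangle_\epsilon = \epsilon^{-1}\sum_j\prod_i\frac{(\lambda_i\lambda_i^*)^{\lambda_i^* + \delta_i^j}\,e^{-(\lambda_i + \lambda_i^*)}}{\left[(\lambda_i^* + \delta_i^j)!\right]^2}, \tag{A18}$$

where the catalogues (indexed by $j$) are now restricted to those in $C_\pm$. To proceed, we first take the logarithm of the likelihood to separate
out the catalogue-set-dependent normalization, which is of no consequence for our discussion. We can thus write

$$\ln\langle\mathcal{L}\rangle_\epsilon = -N\ln 2 + \sum_i\left[-\lambda_i + \lambda_i^*\ln\lambda_i + \ln\Gamma(1 + \lambda_i^*)\right] + \ln E,$$

where we have defined

$$E \equiv \sum_j\prod_i\frac{(\lambda_i\lambda_i^*)^{\delta_i^j}}{\left[\Gamma(1 + \lambda_i^* + \delta_i^j)\right]^2}.$$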

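Taylor expanding in $\delta_i^j$ (since $|\delta_i^j| < 1$ for our catalogues) we find

$$\ln\langle\mathcal{L}\rangle_\epsilon = -N\ln 2 + \sum_i\left[\lambda_i^*\ln\lambda_i - \lambda_i + \ln\Gamma(1 + \lambda_i^*)\right] + \ln E\big|_{\delta=0} + \sum_{i,j}\left.\frac{1}{E}\frac{\partial E}{\partial\delta_i^j}\right|_{\delta=0}\delta_i^j \tag{A19}$$

$$\qquad + \frac{1}{2}\sum_{i,j,k,l}\left[\frac{1}{E}\frac{\partial^2 E}{\partial\delta_i^j\,\partial\delta_k^l} - \frac{1}{E^2}\frac{\partial E}{\partial\delta_i^j}\frac{\partial E}{\partial\delta_k^l}\right]_{\delta=0}\delta_i^j\delta_k^l + \mathcal{O}(\delta^3), \tag{A20}$$

where "$\delta = 0$" denotes $\delta_i^j = 0\ \forall i,j$. Inserting $E$ and the derivatives

$$E\big|_{\delta=0} = 2^N\prod_m\left[\Gamma(1 + \lambda_m^*)\right]^{-2}, \tag{A21}$$

$$\left.\frac{\partial E}{\partial\delta_i^j}\right|_{\delta=0} = \ell(\lambda_i,\lambda_i^*)\prod_m\left[\Gamma(1 + \lambda_m^*)\right]^{-2}, \tag{A22}$$

$$\left.\frac{\partial^2 E}{\partial\delta_i^j\,\partial\delta_k^l}\right|_{\delta=0} = \ell(\lambda_i,\lambda_i^*)\,\ell(\lambda_k,\lambda_k^*)\,\delta_{jl}\prod_m\left[\Gamma(1 + \lambda_m^*)\right]^{-2}, \tag{A23}$$

where $\ell(\lambda,\lambda^*) = \ln(\lambda\lambda^*) - 2\Psi(1 + \lambda^*)$ (the digamma function $\Psi$ coming from the factorial as gamma function), we obtain

$$\ln\langle\mathcal{L}\rangle_\epsilon = \sum_i\left(\lambda_i^*\ln\lambda_i - \lambda_i\right) + \sum_i\ell(\lambda_i,\lambda_i^*)\,2^{-N}\sum_j\delta_i^j + \frac{1}{2}\sum_{i,j,k,l}2^{-N}\left[\ell(\lambda_i,\lambda_i^*)\,\ell(\lambda_k,\lambda_k^*)\,\delta_{jl} - 2^{-N}\ell(\lambda_i,\lambda_i^*)\,\ell(\lambda_k,\lambda_k^*)\right]\delta_i^j\delta_k^l + \mathcal{O}(\delta^3) + {\rm const.} \tag{A24}$$

$$\qquad = \sum_i\left(\lambda_i^*\ln\lambda_i - \lambda_i\right) + \sum_i\ell(\lambda_i,\lambda_i^*)\,2^{-N}\sum_j\delta_i^j + \frac{1}{2}\sum_{i,k}\ell(\lambda_i,\lambda_i^*)\,\ell(\lambda_k,\lambda_k^*)\left[2^{-N}\sum_j\delta_i^j\delta_k^j - 2^{-2N}\sum_{j,l}\delta_i^j\delta_k^l\right] + \mathcal{O}(\delta^3) + {\rm const.}, \tag{A25}$$

where $\delta_{jl}$ is the Kronecker delta. We can evaluate the $\delta$-sums using our knowledge of the set of catalogues $C_\pm$:

$$\sum_j\delta_i^j = 2^{N-1}\left(\lceil\lambda_i^*\rceil + \lfloor\lambda_i^*\rfloor - 2\lambda_i^*\right) = 2^N\left(\Delta_i^* - \frac{1}{2}\right), \tag{A26}$$

$$\sum_{j,l}\delta_i^j\delta_k^l = 2^{2N}\left(\Delta_i^* - \frac{1}{2}\right)\left(\Delta_k^* - \frac{1}{2}\right) = 2^{2N}\left[\Delta_i^*\Delta_k^* - \frac{1}{2}\left(\Delta_i^* + \Delta_k^*\right) + \frac{1}{4}\right], \tag{A27}$$

$$\sum_j\delta_i^j\delta_k^j = 2^{N-2}\left[\Delta_i^*\Delta_k^* + \Delta_i^*(\Delta_k^* - 1) + (\Delta_i^* - 1)\Delta_k^* + (\Delta_i^* - 1)(\Delta_k^* - 1)\right] = 2^N\left[\Delta_i^*\Delta_k^* - \frac{1}{2}\left(\Delta_i^* + \Delta_k^*\right) + \frac{1}{4}\right], \tag{A28}$$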



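where we have defined $\Delta_i^* = \lceil\lambda_i^*\rceil - \lambda_i^*$ and excluded the possibility that $\lceil\lambda_i^*\rceil = \lfloor\lambda_i^*\rfloor = \lambda_i^*$. Inserting (A27) and (A28) in (A25) we find
that the second-order term is zero due to cancellation between its two constituent terms. Hence, also inserting (A26), we finally arrive at

$$\ln\langle\mathcal{L}\rangle_\epsilon = \sum_i\left[\lambda_i^*\ln\lambda_i - \lambda_i + \ell(\lambda_i,\lambda_i^*)\left(\Delta_i^* - \frac{1}{2}\right)\right] + \mathcal{O}(\delta^3) + {\rm const.} \tag{A29}$$

$$\qquad = \sum_i\left[\left(\lceil\lambda_i^*\rceil - \frac{1}{2}\right)\ln\lambda_i - \lambda_i + \left(\Delta_i^* - \frac{1}{2}\right)\left(\ln\lambda_i^* - 2\Psi(1 + \lambda_i^*)\right)\right] + \mathcal{O}(\delta^3) + {\rm const.} \tag{A30}$$

□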

The theorem states that a good approximation to $\langle\mathcal{L}\rangle_\epsilon$ is given by using $N_i = \lceil\lambda_i^*\rceil - 1/2$ in a single-catalogue likelihood $\mathcal{L}$. This
expression, however, does give rise to an offset in the best-fitting values away from the true means, associated with shot noise. As we are
using the catalogue construction as a way of defining a meaningful expected likelihood which is not just an arbitrary point estimate, we are
not really interested in this offset (and would like to separate it from sources of bias); rather the variance is what concerns us. Therefore, we
propose using the very similar expression

$$\langle\ln\mathcal{L}\rangle = \sum_i\left(\lambda_i^*\ln\lambda_i - \lambda_i\right) + {\rm const.} \tag{A31}$$

The best-fitting values for $\lambda_i$ of this expression are equal to the true means $\lambda_i^*$. However, how do the standard deviations compare? The
standard deviations are given by

$$\sigma_{\epsilon,i} = \sqrt{\lceil\lambda_i^*\rceil - 1/2}, \qquad \sigma_{{\rm mean},i} = \sqrt{\lambda_i^*}, \tag{A32}$$

where $\sigma_{\epsilon,i}$ is the standard deviation of equation (A30) and $\sigma_{{\rm mean},i}$ the standard deviation of equation (A31). Upper and lower limits for their
ratio can then be given as

$$\frac{1}{\sqrt{1 + 1/2\lambda_i^*}} < \frac{\sigma_{{\rm mean},i}}{\sigma_{\epsilon,i}} < \frac{1}{\sqrt{1 - 1/2\lambda_i^*}}. \tag{A33}$$
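A short numerical check of the limits in equation (A33) is sketched below in Python. It assumes the Poisson-like standard deviations of equation (A32), i.e. the square roots of $\lceil\lambda_i^*\rceil - 1/2$ and of $\lambda_i^*$; the function names and the sample values of $\lambda_i^*$ are illustrative assumptions.

```python
import math


def sigma_epsilon(lam_star):
    """Standard deviation associated with the count ceil(lambda*) - 1/2."""
    return math.sqrt(math.ceil(lam_star) - 0.5)


def sigma_mean(lam_star):
    """Standard deviation associated with the average catalogue, N_i = lambda*."""
    return math.sqrt(lam_star)


def ratio_bounds(lam_star):
    """Lower and upper limits on sigma_mean / sigma_epsilon."""
    lower = 1.0 / math.sqrt(1.0 + 1.0 / (2.0 * lam_star))
    # The upper limit diverges once lambda* <= 1/2.
    if lam_star > 0.5:
        upper = 1.0 / math.sqrt(1.0 - 1.0 / (2.0 * lam_star))
    else:
        upper = math.inf
    return lower, upper


# Hypothetical non-integer bin means, from sparse to well populated.
for lam_star in (0.2, 0.7, 1.3, 5.5, 20.1, 100.9):
    ratio = sigma_mean(lam_star) / sigma_epsilon(lam_star)
    lower, upper = ratio_bounds(lam_star)
    print(f"lambda* = {lam_star:6.1f}: {lower:.4f} < {ratio:.4f} < {upper:.4f}")
```

The limits close in on unity as $\lambda_i^*$ grows and open up rapidly for sparsely populated bins, which is the shot-noise behaviour described in the text.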

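It is clear that for $\lambda_i^* < 1$ the relative error will become large as $\lambda_i^*$ decreases. Again, this is due to shot noise. One could always make bins
large enough that at least a few elements fall in each bin, ensuring only moderate relative errors in the standard deviations. Such a binning
might, however, not be optimal or even close to optimal, and thus reflect the underlying distribution poorly. It appears that no general conclusion can
be drawn here. However, if we specify a dependence $\lambda_i = \lambda_i^*(\theta/\theta^*)^{\alpha_i}$ for the $\lambda_i$'s on some parameter $\theta$, as is typically the case and certainly
here, we can write the following:

$$\langle\ln\mathcal{L}\rangle = \sum_i\left(\lambda_i^*\ln\lambda_i - \lambda_i\right) + {\rm const.} = \ln\theta\sum_i\alpha_i\lambda_i^* - \sum_i\lambda_i^*\left(\frac{\theta}{\theta^*}\right)^{\alpha_i} + {\rm const.}, \tag{A34}$$

$$\ln\langle\mathcal{L}\rangle_\epsilon = \sum_i\left[\left(\lceil\lambda_i^*\rceil - \frac{1}{2}\right)\ln\lambda_i - \lambda_i\right] + {\rm const.} = \ln\theta\sum_i\alpha_i\left(\lceil\lambda_i^*\rceil - \frac{1}{2}\right) - \sum_i\lambda_i^*\left(\frac{\theta}{\theta^*}\right)^{\alpha_i} + {\rm const.} \tag{A35}$$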

Clearly, the only difference between $\ln\langle\mathcal{L}\rangle_\epsilon$ and $\langle\ln\mathcal{L}\rangle$ comes from the difference in the first sum. Naively, we would not expect this to differ






Figure A1. The probability density function for $s_{\rm rel}$ for a typical XCS catalogue with $\alpha_i \in U(-5, 5)$.



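much between the two cases, particularly for a binning that represents the distribution well. What would be the expected value? Consider the
following quantity:

$$s_{\rm rel} \equiv \frac{\sum_i\alpha_i\left(\lceil\lambda_i^*\rceil - 1/2\right)}{\sum_i\alpha_i\lambda_i^*}. \tag{A36}$$

One would generally expect that $(\lceil\lambda_i^*\rceil - \lambda_i^*) \in U(0, 1)$ or at least a similarly symmetric distribution across the bins, so that $\langle\lceil\lambda_i^*\rceil - \lambda_i^*\rangle = 1/2$. We thus expect

$$\langle s_{\rm rel}\rangle = \frac{\sum_i\alpha_i\left(\langle\lceil\lambda_i^*\rceil\rangle - 1/2\right)}{\sum_i\alpha_i\lambda_i^*} = \frac{\sum_i\alpha_i\lambda_i^*}{\sum_i\alpha_i\lambda_i^*} = 1. \tag{A37}$$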

For typical XCS catalogues, even if we assign uncorrelated random exponents $\alpha_i$, the probability distribution for $s_{\rm rel}$ is quite generally very
sharply peaked at or close to $s_{\rm rel} = 1$. An example is shown in Fig. A1, for which $\alpha_i \in U(-5, 5)$. Furthermore, finding typical $\alpha_i$'s for the
various XCS models, we find that $s_{\rm rel}$ = 1 + O(10"'^).

In conclusion, the likelihood $\langle\ln\mathcal{L}\rangle$ of the average catalogue is a good approximation to the average likelihood $\ln\langle\mathcal{L}\rangle_\epsilon$ of our set of
catalogues $C_\pm$, and can also generally be expected to be a good approximation in other similar applications. We have confirmed this by
explicitly comparing to the likelihoods for a Poisson sample of catalogues, as shown in Fig. 6 in the main text.
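The comparison between the two likelihoods can be illustrated with a toy model in Python. The sketch assumes the power-law dependence $\lambda_i = \lambda_i^*(\theta/\theta^*)^{\alpha_i}$ and takes $s_{\rm rel}$ to be the ratio of the first sums in equations (A35) and (A34). The fiducial value $\theta^* = 0.8$, the bin means and the exponents are hypothetical choices, not XCS values, and the best fit is found by a simple grid search.

```python
import math

# Hypothetical toy model: fiducial parameter value, bin means and exponents.
THETA_STAR = 0.8
LAMBDA_STAR = [3.2, 7.9, 15.4, 22.6, 9.1]
ALPHA = [2.0, 1.5, 3.0, 2.5, 1.0]


def bin_means(theta):
    """Poisson means lambda_i = lambda_i^* (theta / theta^*)^alpha_i."""
    return [lam * (theta / THETA_STAR) ** a for lam, a in zip(LAMBDA_STAR, ALPHA)]


def log_likelihood(theta, counts):
    """Poisson log-likelihood up to a constant: sum_i (N_i ln lambda_i - lambda_i)."""
    return sum(n * math.log(lam) - lam for n, lam in zip(counts, bin_means(theta)))


# Average catalogue (N_i = lambda_i^*) and the counts singled out by the theorem.
AVERAGE_CATALOGUE = list(LAMBDA_STAR)
SMOOTHED_CATALOGUE = [math.ceil(lam) - 0.5 for lam in LAMBDA_STAR]

THETA_GRID = [0.6 + 0.001 * k for k in range(401)]


def best_fit(counts):
    """Grid-search maximum-likelihood estimate of theta."""
    return max(THETA_GRID, key=lambda theta: log_likelihood(theta, counts))


s_rel = (sum(a * n for a, n in zip(ALPHA, SMOOTHED_CATALOGUE))
         / sum(a * lam for a, lam in zip(ALPHA, LAMBDA_STAR)))

print("best fit, average catalogue :", best_fit(AVERAGE_CATALOGUE))
print("best fit, smoothed catalogue:", best_fit(SMOOTHED_CATALOGUE))
print("s_rel =", s_rel)
```

In this toy setting the average catalogue recovers the fiducial value, while the counts $\lceil\lambda_i^*\rceil - 1/2$ shift the best fit slightly, in line with the shot-noise offset discussed above.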



This paper has been typeset from a TeX/LaTeX file prepared by the author.